Research & Thoughts

Reward-Based Token Modelling with Selective Cloud Assistance
  • Inference

This method not only reduces the traffic to the cloud LLM, thereby lowering costs, but also allows for flexible control over response quality depending on the reward score threshold.

Dataset for Advancing Academic Knowledge and Machine Reasoning
  • Dataset

With a composition of 11.53 billion tokens, integrating 8.01 billion tokens of synthetic data with 3.52 billion tokens of rich textbook data, Intellecta is crafted to foster advanced reasoning and comprehensive educational narrative generation.

Inference Acceleration for Large Language Models on CPUs
  • Inference

In this paper, we explore using CPUs to accelerate inference for large language models.