
Scaling Laws for Neural Language Models: New Evidence Challenges Chinchilla Predictions

ArXiv / Epoch AI · By Epoch AI Research · March 23, 2026 · 9 min read

New empirical research from Epoch AI challenges the Chinchilla scaling laws, suggesting that compute-optimal training requires significantly more tokens than previously believed, with implications for how frontier models should be trained.

Researchers at Epoch AI have published new empirical evidence challenging the widely adopted Chinchilla scaling laws, which have guided the training of most frontier language models since 2022. The new research suggests that compute-optimal training requires substantially more training tokens than the Chinchilla predictions indicate, particularly at large compute budgets.

The Chinchilla paper, published by DeepMind in 2022, established that optimal model training requires approximately 20 training tokens per model parameter. This finding led to a shift in the field toward training smaller models on more data, with models like Llama 2 and Mistral following this prescription.
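To make the rule concrete, here is a minimal sketch (not code from either paper) that combines the 20-tokens-per-parameter ratio with the standard heuristic that training compute is roughly C ≈ 6ND FLOPs, where N is the parameter count and D is the number of training tokens:

```python
# Minimal sketch of the Chinchilla rule of thumb; the constants are the
# commonly cited heuristics, not values from the new Epoch AI analysis.

CHINCHILLA_TOKENS_PER_PARAM = 20  # ~20 tokens per parameter (Chinchilla, 2022)

def chinchilla_optimal(n_params: float) -> tuple[float, float]:
    """Return (compute-optimal training tokens, approximate training FLOPs)."""
    tokens = CHINCHILLA_TOKENS_PER_PARAM * n_params
    flops = 6 * n_params * tokens  # C ≈ 6ND approximation
    return tokens, flops

for n in (7e9, 70e9):  # e.g. 7B- and 70B-parameter models
    tokens, flops = chinchilla_optimal(n)
    print(f"N = {n:.0e} params -> D ≈ {tokens:.1e} tokens, C ≈ {flops:.1e} FLOPs")
```

For a 70B-parameter model, for example, the rule implies roughly 1.4 trillion training tokens.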

The new Epoch AI research, based on a comprehensive analysis of training runs across multiple organizations, finds that the optimal token-to-parameter ratio increases significantly at larger compute scales. At the compute budgets now used for frontier models, the optimal ratio may be closer to 50-100 tokens per parameter.

If confirmed, these findings would suggest that current frontier models are substantially undertrained relative to their optimal configuration. The practical implication: getting the best performance out of a given compute budget may require training longer on more data rather than scaling up model size.
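One way to see what the shift means in practice is to hold the compute budget fixed and solve for the allocation a given ratio implies. Under the same C ≈ 6ND heuristic, setting D = rN gives N = √(C / 6r), so a larger r yields a smaller model trained on more tokens. The sketch below is illustrative only, using a hypothetical budget and the ratios quoted in the article; it is not Epoch AI's fitting methodology:

```python
import math

# Illustrative only: how the token-to-parameter ratio r reallocates a fixed
# compute budget C between model size N and training tokens D.
# Assumes C ≈ 6 * N * D with D = r * N, which gives N = sqrt(C / (6 * r)).

def optimal_allocation(compute_flops: float, tokens_per_param: float) -> tuple[float, float]:
    n_params = math.sqrt(compute_flops / (6 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

C = 1e25  # hypothetical frontier-scale budget in FLOPs (assumption, not from the article)
for r in (20, 100):  # Chinchilla ratio vs. upper end of the new estimate
    n, d = optimal_allocation(C, r)
    print(f"r = {r:>3}: N ≈ {n:.2e} params, D ≈ {d:.2e} tokens")
```

At this budget, moving from r = 20 to r = 100 cuts the optimal model size by a factor of √5 ≈ 2.2 (from about 2.9e11 to 1.3e11 parameters) while increasing the token count by the same factor.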

Original source: ArXiv / Epoch AI
