
The Unreasonable Effectiveness of Scaling Laws in AI

arXiv · March 31, 2026 · 10 min read

arXiv:2603.28507v1 Announce Type: cross. Author: Chien-Ping Lu


Abstract: Classical AI scaling laws, especially for pre-training, describe how training loss decreases with compute in a power-law form. Their effectiveness has a basic and very practical sense: they make progress predictable, albeit at a declining rate. Yet their effectiveness is also unreasonable in two further senses. First, these laws are largely empirical and observational, but they appear repeatedly across model families and increasingly across training-adjacent regimes. Second, despite the diminishing returns they predict, progress in practice has often continued through rapidly improving efficiency, visible for example in falling cost per token. This paper argues that both features arise from the same source: scaling laws are unusually effective because they abstract away from many realization details. The compute variable is best understood as logical compute, an implementation-agnostic notion of model-side work, while the practical burden of scaling depends on how efficiently real resources are converted into that compute. This abstraction helps explain both why the laws travel so well across settings and why they give rise to a persistent efficiency game in hardware, algorithms, and systems. Once efficiency is made explicit, the main practical question becomes how many efficiency doublings are required to keep scaling productive despite diminishing returns. Under that view, diminishing returns are not only a geometric flattening of the loss curve, but also rising pressure for cost reduction, system-level innovation, and the breakthroughs needed to sustain Moore-like efficiency doublings.
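The abstract's question of "how many efficiency doublings" can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes a pure power law, loss = a * C^(-alpha), in logical compute C, and the function names and the exponent alpha = 0.05 are hypothetical values chosen only for illustration. Under that assumption, multiplying the loss by a ratio r (with 0 < r < 1) requires log2(1/r) / alpha doublings of logical compute; if the real-resource budget is held fixed, those doublings would have to come from efficiency gains.

```python
import math


def loss(compute: float, a: float = 1.0, alpha: float = 0.05) -> float:
    """Assumed power law (illustrative, not the paper's fit): a * compute**(-alpha)."""
    return a * compute ** (-alpha)


def doublings_for_loss_ratio(ratio: float, alpha: float = 0.05) -> float:
    """Doublings of logical compute needed to multiply the loss by `ratio`.

    Solves 2**(-d * alpha) == ratio for d, which follows directly from the
    assumed power law above.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return math.log2(1.0 / ratio) / alpha


if __name__ == "__main__":
    # Hypothetical exponent; real exponents depend on the model family and data.
    for target in (0.9, 0.5):
        d = doublings_for_loss_ratio(target, alpha=0.05)
        print(f"loss x {target}: about {d:.1f} compute doublings")
```

Because the required number of doublings scales as 1/alpha, a small exponent makes each further fixed-percentage loss reduction expensive, which is one way to read the abstract's point that diminishing returns translate into pressure for sustained efficiency doublings.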

Comments: 8 pages, 1 figure

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.28507 [cs.LG]

(or arXiv:2603.28507v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.28507

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Chien-Ping Lu. [v1] Mon, 30 Mar 2026 14:42:53 UTC (10 KB)

Original source

arXiv
