On Neural Scaling Laws for Weather Emulation through Continual Training

arXiv, March 26, 2026

Shashank Subramanian, Alexander Kiefer, Arnur Nigmetov

Abstract: Neural scaling laws, which in some domains can predict the performance of large neural networks as a function of model, data, and compute scale, are the cornerstone of building foundation models in Natural Language Processing and Computer Vision. We study neural scaling in Scientific Machine Learning, focusing on models for weather forecasting. To analyze scaling behavior in as simple a setting as possible, we adopt a minimal, scalable, general-purpose Swin Transformer architecture, and we use continual training with constant learning rates and periodic cooldowns as an efficient training strategy. We show that models trained in this minimalist way follow predictable scaling trends and even outperform standard cosine learning rate schedules. Cooldown phases can be re-purposed to improve downstream performance, e.g., enabling accurate multi-step rollouts over longer forecast horizons as well as sharper predictions through spectral loss adjustments. We also systematically explore a wide range of model and dataset sizes under various compute budgets to construct IsoFLOP curves, and we identify compute-optimal training regimes. Extrapolating these trends to larger scales highlights potential performance limits, demonstrating that neural scaling can serve as an important diagnostic for efficient resource allocation. We open-source our code for reproducibility.
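To make the contrast between the two learning rate strategies named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the linear decay shape of the cooldown, the decay to zero, and all step counts are hypothetical choices that the abstract does not specify.

```python
import math


def constant_lr_with_cooldown(step, peak_lr, cooldown_start=None, cooldown_steps=0):
    """Constant learning rate, optionally followed by a linear cooldown to zero.

    Assumption: the cooldown is linear and ends at zero. With
    cooldown_start=None the rate stays constant, which models the long
    "trunk" run that cooldown branches start from.
    """
    if cooldown_start is None or step < cooldown_start:
        return peak_lr
    if cooldown_steps <= 0:
        raise ValueError("cooldown_steps must be positive when cooldown_start is set")
    remaining = cooldown_start + cooldown_steps - step
    return peak_lr * max(remaining, 0) / cooldown_steps


def cosine_lr(step, total_steps, peak_lr, min_lr=0.0):
    """Standard cosine decay from peak_lr to min_lr over total_steps."""
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    peak = 1e-3
    # Two hypothetical cooldown branches taken from the same constant-rate run.
    for start in (1000, 2000):
        probes = (start - 1, start + 100, start + 200)
        print(start, [constant_lr_with_cooldown(s, peak, start, 200) for s in probes])
    # A cosine schedule must fix its total length in advance.
    print([cosine_lr(s, 2200, peak) for s in (0, 1100, 2200)])
```

The design point this sketch is meant to show: a cosine schedule depends on `total_steps`, so each training budget needs its own run from scratch, whereas a constant-rate run can be branched into a short cooldown at any checkpoint, so one long run can yield evaluated models at several budgets.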

Comments: ICLR Foundation Models for Science Workshop 2026, 19 pages, 13 figures

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2603.25687 [cs.LG]

(or arXiv:2603.25687v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.25687

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Shashank Subramanian
[v1] Thu, 26 Mar 2026 17:37:25 UTC (4,967 KB)
