
The Spectral Edge Thesis: A Mathematical Framework for Intra-Signal Phase Transitions in Neural Network Training

arXiv cs.LG | by Yongzhong Xu | April 1, 2026

Abstract: We develop the spectral edge thesis: phase transitions in neural network training -- grokking, capability gains, loss plateaus -- are controlled by the spectral gap of the rolling-window Gram matrix of parameter updates. In the extreme aspect ratio regime (parameters $P \sim 10^8$, window $W \sim 10$), the classical BBP detection threshold is vacuous; the operative structure is the intra-signal gap separating dominant from subdominant modes at position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$. From three axioms we derive: (i) gap dynamics governed by a Dyson-type ODE with curvature asymmetry, damping, and gradient driving; (ii) a spectral loss decomposition linking each mode's learning contribution to its Davis--Kahan stability coefficient; (iii) the Gap Maximality Principle, showing that $k^*$ is the unique dynamically privileged position -- its collapse is the only one that disrupts learning, and it sustains itself through an $\alpha$-feedback loop requiring no assumption on the optimizer. The adiabatic parameter $\mathcal{A} = \|\Delta G\|_F / (\eta\, g^2)$ controls circuit stability: $\mathcal{A} \ll 1$ (plateau), $\mathcal{A} \sim 1$ (phase transition), $\mathcal{A} \gg 1$ (forgetting). Tested across six model families (150K--124M parameters): gap dynamics precede every grokking event (24/24 with weight decay, 0/24 without), the gap position is optimizer-dependent (Muon: $k^*=1$, AdamW: $k^*=2$ on the same model), and 19/20 quantitative predictions are confirmed. The framework is consistent with the edge of stability, Tensor Programs, Dyson Brownian motion, the Lottery Ticket Hypothesis, and neural scaling laws.
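To make the gap position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$ concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes the rolling window is stored as a $W \times P$ array with one flattened parameter update per row, and that the $\sigma_j$ are the singular values of that array, obtained as square roots of the eigenvalues of its $W \times W$ Gram matrix. The names `gap_position` and `make_synthetic_updates`, the `eps` guard, and the synthetic data are illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def gap_position(updates, eps=1e-12):
    """Return (k_star, sigma, ratios) for a window of parameter updates.

    updates: array of shape (W, P), one flattened update per row (assumed layout).
    k_star is 1-indexed: the j that maximizes sigma_j / sigma_{j+1}.
    """
    # The Gram matrix is only W x W, so this stays cheap even when P is huge.
    gram = updates @ updates.T
    # eigvalsh returns ascending eigenvalues; reverse to get descending order.
    eigvals = np.linalg.eigvalsh(gram)[::-1]
    # Singular values of `updates` are square roots of the Gram eigenvalues.
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    # Consecutive ratios sigma_j / sigma_{j+1}; eps guards against division by zero.
    ratios = sigma[:-1] / np.maximum(sigma[1:], eps)
    k_star = int(np.argmax(ratios)) + 1
    return k_star, sigma, ratios


def make_synthetic_updates(window=10, n_params=2000, seed=0):
    """Hypothetical test data: two strong shared directions plus weak noise."""
    rng = np.random.default_rng(seed)
    # Two orthonormal "signal" directions in parameter space.
    directions, _ = np.linalg.qr(rng.standard_normal((n_params, 2)))
    # Per-step coefficients along each direction, with different strengths.
    coeffs = rng.standard_normal((window, 2)) * np.array([10.0, 8.0])
    noise = 0.01 * rng.standard_normal((window, n_params))
    return coeffs @ directions.T + noise


if __name__ == "__main__":
    k_star, sigma, ratios = gap_position(make_synthetic_updates())
    print("k* =", k_star)
    print("singular values:", np.round(sigma, 3))
```

Working through the $W \times W$ Gram matrix rather than the $W \times P$ update matrix is what keeps this tractable in the regime the abstract describes ($P \sim 10^8$, $W \sim 10$). In the synthetic data, the two planted directions are meant to play the role of dominant modes and the noise that of subdominant ones.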

Comments: 60 pages, 5 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.28964 [cs.LG]

(or arXiv:2603.28964v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.28964

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yongzhong Xu
[v1] Mon, 30 Mar 2026 20:10:22 UTC (1,002 KB)
