The Spectral Edge Thesis: A Mathematical Framework for Intra-Signal Phase Transitions in Neural Network Training
arXiv:2603.28964v1 Announce Type: new
Abstract: We develop the spectral edge thesis: phase transitions in neural network training -- grokking, capability gains, loss plateaus -- are controlled by the spectral gap of the rolling-window Gram matrix of parameter updates. In the extreme aspect ratio regime (parameters $P \sim 10^8$, window $W \sim 10$), the classical BBP detection threshold is vacuous; the operative structure is the intra-signal gap separating dominant from subdominant modes at position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$. From three axioms we derive: (i) gap dynamics governed by a Dyson-type ODE with curvature asymmetry, damping, and gradient driving; (ii) a spectral loss decomposition linking each mode's learning contribution to its Davis--Kahan stability coefficient; (iii) the Gap Maximality Principle, showing that $k^*$ is the unique dynamically privileged position -- its collapse is the only one that disrupts learning, and it sustains itself through an $\alpha$-feedback loop requiring no assumption on the optimizer. The adiabatic parameter $\mathcal{A} = \|\Delta G\|_F / (\eta\, g^2)$ controls circuit stability: $\mathcal{A} \ll 1$ (plateau), $\mathcal{A} \sim 1$ (phase transition), $\mathcal{A} \gg 1$ (forgetting). Tested across six model families (150K--124M parameters): gap dynamics precede every grokking event (24/24 with weight decay, 0/24 without), the gap position is optimizer-dependent (Muon: $k^*=1$, AdamW: $k^*=2$ on the same model), and 19/20 quantitative predictions are confirmed. The framework is consistent with the edge of stability, Tensor Programs, Dyson Brownian motion, the Lottery Ticket Hypothesis, and neural scaling laws.
Comments: 60 pages, 5 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.28964 [cs.LG]
(or arXiv:2603.28964v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2603.28964
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Yongzhong Xu [v1] Mon, 30 Mar 2026 20:10:22 UTC (1,002 KB)
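The abstract's central quantity -- the intra-signal gap position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$ of the rolling-window Gram matrix of parameter updates -- can be illustrated with a minimal sketch. This is not the paper's implementation; the window size, parameter count, and the injected dominant mode are toy assumptions (the paper's regime is $P \sim 10^8$, $W \sim 10$):

```python
import numpy as np

# Illustrative sketch (not the paper's code): find the intra-signal gap
# position k* = argmax_j sigma_j / sigma_{j+1} for a rolling window of
# parameter updates. P and W are toy values for demonstration.
rng = np.random.default_rng(0)
P, W = 1000, 10

# Stack the last W parameter-update vectors as rows of a W x P matrix.
updates = rng.standard_normal((W, P))
updates[0] *= 10.0  # inject one dominant mode so a clear gap exists

# Singular values of the update matrix are the square roots of the
# eigenvalues of the W x W Gram matrix updates @ updates.T.
sigma = np.linalg.svd(updates, compute_uv=False)

# k* is the position of the largest consecutive singular-value ratio
# (1-indexed, matching the abstract's sigma_j / sigma_{j+1} notation).
ratios = sigma[:-1] / sigma[1:]
k_star = int(np.argmax(ratios)) + 1
print(k_star)  # -> 1: the injected dominant mode sits alone above the gap
```

Working with the $W \times W$ Gram matrix rather than the $P \times P$ covariance is what makes the extreme-aspect-ratio regime tractable: only $W$ nontrivial singular values exist, so the gap structure is cheap to track at every training step.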