The Spectral Edge Thesis: A Mathematical Framework for Intra-Signal Phase Transitions in Neural Network Training
arXiv:2603.28964v1 Announce Type: new
Abstract: We develop the spectral edge thesis: phase transitions in neural network training -- grokking, capability gains, loss plateaus -- are controlled by the spectral gap of the rolling-window Gram matrix of parameter updates. In the extreme aspect ratio regime (parameters $P \sim 10^8$, window $W \sim 10$), the classical BBP detection threshold is vacuous; the operative structure is the intra-signal gap separating dominant from subdominant modes at position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$. From three axioms we derive: (i) gap dynamics governed by a Dyson-type ODE with curvature asymmetry, damping, and gradient driving; (ii) a spectral loss decomposition linking each mode's learning contribution to its Davis--Kahan stability coefficient; (iii) the Gap Maximality Principle, showing that $k^*$ is the unique dynamically privileged position -- its collapse is the only one that disrupts learning, and it sustains itself through an $\alpha$-feedback loop requiring no assumption on the optimizer. The adiabatic parameter $\mathcal{A} = \|\Delta G\|_F / (\eta\, g^2)$ controls circuit stability: $\mathcal{A} \ll 1$ (plateau), $\mathcal{A} \sim 1$ (phase transition), $\mathcal{A} \gg 1$ (forgetting). Tested across six model families (150K--124M parameters): gap dynamics precede every grokking event (24/24 with weight decay, 0/24 without), the gap position is optimizer-dependent (Muon: $k^*=1$, AdamW: $k^*=2$ on the same model), and 19/20 quantitative predictions are confirmed. The framework is consistent with the edge of stability, Tensor Programs, Dyson Brownian motion, the Lottery Ticket Hypothesis, and neural scaling laws.
Comments: 60 pages, 5 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.28964 [cs.LG]
(or arXiv:2603.28964v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2603.28964
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Yongzhong Xu [v1] Mon, 30 Mar 2026 20:10:22 UTC (1,002 KB)
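The abstract's central quantity -- the intra-signal gap position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$ of the rolling-window Gram matrix of parameter updates -- can be illustrated with a minimal sketch. This is not the paper's implementation; the window size, parameter count, and the injected dominant mode are toy assumptions (the paper's regime is $P \sim 10^8$, $W \sim 10$):

```python
import numpy as np

# Illustrative sketch (not the paper's code): find the intra-signal gap
# position k* = argmax_j sigma_j / sigma_{j+1} for a rolling window of
# parameter updates. P and W are toy values for demonstration.
rng = np.random.default_rng(0)
P, W = 1000, 10

# Stack the last W parameter-update vectors as rows of a W x P matrix.
updates = rng.standard_normal((W, P))
updates[0] *= 10.0  # inject one dominant mode so a clear gap exists

# Singular values of the update matrix are the square roots of the
# eigenvalues of the W x W Gram matrix updates @ updates.T.
sigma = np.linalg.svd(updates, compute_uv=False)

# k* is the position of the largest consecutive singular-value ratio
# (1-indexed, matching the abstract's sigma_j / sigma_{j+1} notation).
ratios = sigma[:-1] / sigma[1:]
k_star = int(np.argmax(ratios)) + 1
print(k_star)  # -> 1: the injected dominant mode sits alone above the gap
```

Working with the $W \times W$ Gram matrix rather than the $P \times P$ covariance is what makes the extreme-aspect-ratio regime tractable: only $W$ nontrivial singular values exist, so the gap structure is cheap to track at every training step.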