
The Geometric Cost of Normalization: Affine Bounds on the Bayesian Complexity of Neural Networks

arXiv, March 31, 2026
arXiv:2603.27432v1. Announce Type: new. Author: Sungbae Chun

Abstract: LayerNorm and RMSNorm impose fundamentally different geometric constraints on their outputs, and this difference has a precise, quantifiable consequence for model complexity. We prove that LayerNorm's mean-centering step, by confining data to a linear hyperplane (through the origin), reduces the Local Learning Coefficient (LLC) of the subsequent weight matrix by exactly $m/2$ (where $m$ is its output dimension); RMSNorm's projection onto a sphere preserves the LLC entirely. This reduction is structurally guaranteed before any training begins, determined by data manifold geometry alone. The underlying condition is a geometric threshold: for the codimension-one manifolds we study, the LLC drop is binary. Any non-zero curvature, regardless of sign or magnitude, is sufficient to preserve the LLC, while only affinely flat manifolds cause the drop. At finite sample sizes this threshold acquires a smooth crossover whose width depends on how much of the data distribution actually experiences the curvature, not merely on whether curvature exists somewhere. We verify both predictions experimentally with controlled single-layer scaling experiments using the wrLLC framework. We further show that Softmax simplex data introduces a "smuggled bias" that activates the same $m/2$ LLC drop when paired with an explicit downstream bias, proved via the affine symmetry extension of the main theorem and confirmed empirically.
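The geometric claims in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: the function names, the fixed test vectors, and the omission of learned gain and bias parameters are all assumptions made for the example. It shows that LayerNorm outputs sum to zero (a hyperplane through the origin), that RMSNorm outputs have constant norm (a sphere), and that Softmax outputs sum to one (an affine hyperplane). It then shows one plausible intuition for the $m/2$ figure, offered as a reading of the abstract rather than as the paper's proof: on flat data, shifting each of the $m$ rows of the downstream weight matrix by a constant leaves the layer's output unchanged, giving $m$ free directions in parameter space, whereas on the sphere the same shift changes the output.

```python
import math

def layer_norm(x):
    """LayerNorm without learned gain/bias: center, then scale to unit variance."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    var = sum(v * v for v in centered) / len(x)
    return [v / math.sqrt(var) for v in centered]

def rms_norm(x):
    """RMSNorm without learned gain: scale to unit root-mean-square, no centering."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x]

def softmax(x):
    """Numerically stable softmax; the output lies on the probability simplex."""
    mx = max(x)
    exps = [math.exp(v - mx) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x, b=None):
    """Compute W x, or W x + b when a bias is given."""
    out = [sum(w * v for w, v in zip(row, x)) for row in W]
    return out if b is None else [o + bi for o, bi in zip(out, b)]

def shift_rows(W, c):
    """Return W + c 1^T, i.e. add the constant c[i] to every entry of row i."""
    return [[w + ci for w in row] for row, ci in zip(W, c)]

def max_abs_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical fixed inputs: input dimension d = 6, output dimension m = 3.
x = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]
d, m = len(x), 3
W = [[0.1 * (i + 1) * (j - 2) for j in range(d)] for i in range(m)]
b = [0.1, 0.2, 0.3]
c = [1.0, -2.0, 0.5]              # one free shift per output row
W_shifted = shift_rows(W, c)
b_shifted = [bi - ci for bi, ci in zip(b, c)]

ln, rn, sm = layer_norm(x), rms_norm(x), softmax(x)

ln_sum = sum(ln)                          # hyperplane through the origin: about 0
rn_sq_norm = sum(v * v for v in rn)       # sphere: squared norm equals d
sm_sum = sum(sm)                          # affine hyperplane: equals 1

# LayerNorm data: W and W + c 1^T give the same output (m flat directions).
ln_gap = max_abs_diff(matvec(W, ln), matvec(W_shifted, ln))
# RMSNorm data: the same shift changes the output, so it is not a symmetry.
rn_gap = max_abs_diff(matvec(W, rn), matvec(W_shifted, rn))
# Softmax data with a downstream bias: (W + c 1^T, b - c) matches (W, b).
sm_gap = max_abs_diff(matvec(W, sm, b), matvec(W_shifted, sm, b_shifted))

print(f"LayerNorm sum: {ln_sum:.2e}, RMSNorm squared norm: {rn_sq_norm:.6f}")
print(f"gaps  LayerNorm: {ln_gap:.2e}  RMSNorm: {rn_gap:.2e}  Softmax+bias: {sm_gap:.2e}")
```

The Softmax case only becomes a symmetry once a bias is present to absorb the shift, which matches the abstract's description of a "smuggled bias" that requires an explicit downstream bias.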

Comments: 12 pages, 2 figures

Subjects: Machine Learning (cs.LG); Information Theory (cs.IT)

Cite as: arXiv:2603.27432 [cs.LG]

(or arXiv:2603.27432v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.27432

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Sungbae Chun [view email] [v1] Sat, 28 Mar 2026 22:15:45 UTC (46 KB)
