
The Geometric Cost of Normalization: Affine Bounds on the Bayesian Complexity of Neural Networks

arXiv, March 31, 2026

arXiv:2603.27432v1. Announce Type: new. Author: Sungbae Chun


Abstract: LayerNorm and RMSNorm impose fundamentally different geometric constraints on their outputs, and this difference has a precise, quantifiable consequence for model complexity. We prove that LayerNorm's mean-centering step, by confining data to a linear hyperplane (through the origin), reduces the Local Learning Coefficient (LLC) of the subsequent weight matrix by exactly $m/2$ (where $m$ is its output dimension); RMSNorm's projection onto a sphere preserves the LLC entirely. This reduction is structurally guaranteed before any training begins and is determined by data manifold geometry alone. The underlying condition is a geometric threshold: for the codimension-one manifolds we study, the LLC drop is binary. Any non-zero curvature, regardless of sign or magnitude, is sufficient to preserve the LLC, while only affinely flat manifolds cause the drop. At finite sample sizes this threshold acquires a smooth crossover whose width depends on how much of the data distribution actually experiences the curvature, not merely on whether curvature exists somewhere. We verify both predictions experimentally with controlled single-layer scaling experiments using the wrLLC framework. We further show that Softmax simplex data introduces a "smuggled bias" that activates the same $m/2$ LLC drop when paired with an explicit downstream bias; this is proved via the affine symmetry extension of the main theorem and confirmed empirically.
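The geometric claims in the abstract can be checked numerically. The following is a minimal NumPy sketch, not the paper's code or its wrLLC framework: the sizes `n`, `d`, `m`, the helper functions, and the shift vector `u` are all illustrative assumptions. It checks that LayerNorm outputs sum to zero (a hyperplane through the origin), that RMSNorm outputs lie on a sphere of radius about $\sqrt{d}$, and that Softmax outputs sum to one (an affine hyperplane). It then shows one plausible intuition for the $m/2$ figure: on flat data, replacing $W$ with $W + u\mathbf{1}^\top$ leaves the layer's output unchanged for any $u \in \mathbb{R}^m$, so $m$ parameter directions are invisible to the data. This is offered as intuition only; the paper's actual proof is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 256, 8, 3  # samples, input width, output width (illustrative values)
x = rng.normal(size=(n, d))

def layer_norm(x, eps=1e-5):
    # LayerNorm without learned gain or bias: center, then rescale.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def rms_norm(x, eps=1e-5):
    # RMSNorm without learned gain: rescale only, no centering.
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return x / rms

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

ln, rn, sm = layer_norm(x), rms_norm(x), softmax(x)

ln_sums = ln.sum(axis=-1)               # zero: hyperplane through the origin
rn_norms = np.linalg.norm(rn, axis=-1)  # about sqrt(d): a sphere
sm_sums = sm.sum(axis=-1)               # one: affine hyperplane (simplex)

# Hypothetical downstream layer y = W x (+ b), with W shifted along the ones vector.
W = rng.normal(size=(m, d))
b = rng.normal(size=m)
u = rng.normal(size=m)
W_shift = W + np.outer(u, np.ones(d))

# LayerNorm data: the shift is invisible, so m directions of W are unconstrained.
ln_gap = np.abs(ln @ W.T - ln @ W_shift.T).max()
# RMSNorm data: the same shift changes the output.
rn_gap = np.abs(rn @ W.T - rn @ W_shift.T).max()
# Softmax data: the shift is invisible only if an explicit bias absorbs u.
sm_gap = np.abs((sm @ W.T + b) - (sm @ W_shift.T + (b - u))).max()

print(f"LayerNorm gap {ln_gap:.2e}, RMSNorm gap {rn_gap:.2e}, Softmax+bias gap {sm_gap:.2e}")
```

The Softmax case mirrors the "smuggled bias" described above: because each Softmax output sums to one, the term $u\mathbf{1}^\top x$ equals the constant $u$, which only cancels when the downstream layer has a bias to absorb it.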

Comments: 12 pages, 2 figures

Subjects: Machine Learning (cs.LG); Information Theory (cs.IT)

Cite as: arXiv:2603.27432 [cs.LG]

(or arXiv:2603.27432v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.27432

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Sungbae Chun
[v1] Sat, 28 Mar 2026 22:15:45 UTC (46 KB)

Original source: arXiv, https://arxiv.org/abs/2603.27432
