
Lipschitz verification of neural networks through training

arXiv, March 31, 2026

arXiv:2603.28113v1 (announce type: new)

Authors: Simon Kuang, Yuezhu Xu, S. Sivaranjani, Xinfan Lin


Abstract: The global Lipschitz constant of a neural network governs both adversarial robustness and generalization. Conventional approaches to "certified training" typically follow a train-then-verify paradigm: they train a network and then attempt to bound its Lipschitz constant. Because the efficient "trivial bound" (the product of the layerwise Lipschitz constants) is exponentially loose for arbitrary networks, these approaches must rely on computationally expensive techniques such as semidefinite programming, mixed-integer programming, or branch-and-bound. We propose a different paradigm: rather than designing complex verifiers for arbitrary networks, we design networks to be verifiable by the fast trivial bound. We show that directly penalizing the trivial bound during training forces it to become tight, thereby effectively regularizing the true Lipschitz constant. To achieve this, we identify three structural obstructions to a tight trivial bound (dead neurons, bias terms, and ill-conditioned weights) and introduce architectural mitigations, including a novel notion of norm-saturating polyactivations and bias-free sinusoidal layers. Our approach avoids the runtime complexity of advanced verification while achieving strong results: we train robust networks on MNIST with Lipschitz bounds that are small (orders of magnitude lower than comparable works) and tight (within 10% of the ground truth). The experimental results validate the theoretical guarantees, support the proposed mechanisms, and extend empirically to diverse activations and non-Euclidean norms.
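As a rough illustration of the "trivial bound" mentioned in the abstract, the NumPy sketch below multiplies the layerwise spectral norms of a small bias-free ReLU network and compares the result with a sampled lower bound on the true Lipschitz constant. This is not the authors' code: the function names, the choice of the Euclidean norm, the ReLU architecture, and the random-pair sampling are all assumptions made for illustration. The paper's polyactivations and sinusoidal layers are not reproduced here.

```python
import numpy as np

def trivial_lipschitz_bound(weights):
    """Product of layerwise spectral norms (hypothetical helper).

    Upper-bounds the l2 Lipschitz constant of a feedforward network
    whose activations are 1-Lipschitz (e.g. ReLU).
    """
    bound = 1.0
    for W in weights:
        bound *= np.linalg.norm(W, ord=2)  # largest singular value of W
    return bound

def relu_mlp(x, weights):
    """Bias-free ReLU network: ReLU after every layer except the last."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h

def sampled_lower_bound(weights, n_pairs=2000, seed=0):
    """Largest observed |f(x) - f(y)| / |x - y| over random nearby pairs.

    Any such ratio is a lower bound on the true Lipschitz constant.
    """
    rng = np.random.default_rng(seed)
    d = weights[0].shape[1]
    best = 0.0
    for _ in range(n_pairs):
        x = rng.standard_normal(d)
        y = x + 1e-3 * rng.standard_normal(d)
        num = np.linalg.norm(relu_mlp(x, weights) - relu_mlp(y, weights))
        best = max(best, num / np.linalg.norm(x - y))
    return best

# Two diagonal linear layers show how the product can overshoot:
# the trivial bound is about 3 * 5 = 15, while the composed map
# diag(6, 5) has spectral norm about 6.
A = np.diag([3.0, 1.0])
B = np.diag([2.0, 5.0])
loose_bound = trivial_lipschitz_bound([A, B])
true_linear = np.linalg.norm(B @ A, ord=2)

# Random three-layer network: the sampled value never exceeds the bound.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((16, 8)),
           rng.standard_normal((16, 16)),
           rng.standard_normal((4, 16))]
upper = trivial_lipschitz_bound(weights)
lower = sampled_lower_bound(weights)
print(f"trivial upper bound: {upper:.3f}, sampled lower bound: {lower:.3f}")
```

In a training setting, a quantity like `trivial_lipschitz_bound` (or its logarithm) would be added to the loss as a penalty term. How the paper weights or parameterizes that penalty is not stated in the abstract, so it is left out of this sketch.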

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2603.28113 [cs.LG]

(or arXiv:2603.28113v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.28113

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Simon Kuang
[v1] Mon, 30 Mar 2026 07:20:50 UTC (1,849 KB)

Original source

arXiv

https://arxiv.org/abs/2603.28113
