Live
Black Hat USADark ReadingBlack Hat AsiaAI BusinessMicrosoft is automatically updating Windows 11 24H2 to 25H2 using machine learning - TweakTownGoogle News: Machine Learning80 Years to an Overnight Success: The Real History of Artificial Intelligence - Futurist SpeakerGoogle News: AIWhat next for the struggling rural mothers in China who helped to build AI?SCMP Tech (Asia AI)Apple reportedly signed a 3rd-party driver, by Tiny Corp, for AMD or Nvidia eGPUs for Apple Silicon Macs; it s meant for AI research, not accelerating graphics (AppleInsider)TechmemeBest Resume Builders in 2026: I Applied to 50 Jobs to Test TheseDEV CommunityTruth Technology and the Architecture of Digital TrustDEV CommunityI Switched From GitKraken to This Indie Git Client and I’m Not Going BackDEV CommunityWhy I Run 22 Docker Services at HomeDEV CommunityHow to Embed ChatGPT in Your Website: 5 Methods Compared [2026 Guide]DEV CommunityThe Spaceballs sequel will be released in April next yearEngadgetResearch across 1,372 participants and 9K+ trials details "cognitive surrender", where most subjects had minimal AI skepticism and accepted faulty AI reasoning (Kyle Orland/Ars Technica)TechmemeUnpacking Peter Thiel s big bet on solar-powered cow collarsTechCrunchBlack Hat USADark ReadingBlack Hat AsiaAI BusinessMicrosoft is automatically updating Windows 11 24H2 to 25H2 using machine learning - TweakTownGoogle News: Machine Learning80 Years to an Overnight Success: The Real History of Artificial Intelligence - Futurist SpeakerGoogle News: AIWhat next for the struggling rural mothers in China who helped to build AI?SCMP Tech (Asia AI)Apple reportedly signed a 3rd-party driver, by Tiny Corp, for AMD or Nvidia eGPUs for Apple Silicon Macs; it s meant for AI research, not accelerating graphics (AppleInsider)TechmemeBest Resume Builders in 2026: I Applied to 50 Jobs to Test TheseDEV CommunityTruth Technology and the Architecture of Digital TrustDEV CommunityI Switched From GitKraken to This Indie Git Client and I’m Not Going BackDEV CommunityWhy I Run 22 Docker Services at HomeDEV CommunityHow to Embed ChatGPT in Your Website: 5 Methods Compared [2026 Guide]DEV CommunityThe Spaceballs sequel will be released in April next yearEngadgetResearch across 1,372 participants and 9K+ trials details "cognitive surrender", where most subjects had minimal AI skepticism and accepted faulty AI reasoning (Kyle Orland/Ars Technica)TechmemeUnpacking Peter Thiel s big bet on solar-powered cow collarsTechCrunch
AI NEWS HUBbyEIGENVECTOREigenvector

Swift-SVD: Theoretical Optimality Meets Practical Efficiency in Low-Rank LLM Compression

arXiv cs.CLby [Submitted on 2 Apr 2026]April 4, 20262 min read1 views
Source Quiz

arXiv:2604.01609v1 Announce Type: new Abstract: The deployment of Large Language Models is constrained by the memory and bandwidth demands of static weights and dynamic Key-Value cache. SVD-based compression provides a hardware-friendly solution to reduce these costs. However, existing methods suffer from two key limitations: some are suboptimal in reconstruction error, while others are theoretically optimal but practically inefficient. In this paper, we propose Swift-SVD, an activation-aware, closed-form compression framework that simultaneously guarantees theoretical optimum, practical efficiency and numerical stability. Swift-SVD incrementally aggregates covariance of output activations given a batch of inputs and performs a single eigenvalue decomposition after aggregation, enabling tr

View PDF HTML (experimental)

Abstract:The deployment of Large Language Models is constrained by the memory and bandwidth demands of static weights and dynamic Key-Value cache. SVD-based compression provides a hardware-friendly solution to reduce these costs. However, existing methods suffer from two key limitations: some are suboptimal in reconstruction error, while others are theoretically optimal but practically inefficient. In this paper, we propose Swift-SVD, an activation-aware, closed-form compression framework that simultaneously guarantees theoretical optimum, practical efficiency and numerical stability. Swift-SVD incrementally aggregates covariance of output activations given a batch of inputs and performs a single eigenvalue decomposition after aggregation, enabling training-free, fast, and optimal layer-wise low-rank approximation. We employ effective rank to analyze local layer-wise compressibility and design a dynamic rank allocation strategy that jointly accounts for local reconstruction loss and end-to-end layer importance. Extensive experiments across six LLMs and eight datasets demonstrate that Swift-SVD outperforms state-of-the-art baselines, achieving optimal compression accuracy while delivering 3-70X speedups in end-to-end compression time. Our code will be released upon acceptance.

Comments: Under Review

Subjects:

Computation and Language (cs.CL)

Cite as: arXiv:2604.01609 [cs.CL]

(or arXiv:2604.01609v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2604.01609

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Jian Chen [view email] [v1] Thu, 2 Apr 2026 04:40:50 UTC (613 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modellanguage modeltraining

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Swift-SVD: …modellanguage mo…trainingreleaseannouncepaperarXiv cs.CL

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 203 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!