
OneComp: One-Line Revolution for Generative AI Model Compression

arXiv cs.LG · April 1, 2026


Authors:Yuma Ichikawa, Keiji Kimura, Akihiro Yoshida, Yudai Fujimoto, Hiroki Tokura, Yamato Arai, Yoshiyuki Ishii, Yusei Kawakami, Genki Shikada, Achille Jacquemond, Yoshihiko Fujisawa, Katsuki Fujisawa, Takumi Honda, Akira Sakai


Abstract:Deploying foundation models is increasingly constrained by memory footprint, latency, and hardware costs. Post-training compression can mitigate these bottlenecks by reducing the precision of model parameters without significantly degrading performance; however, its practical implementation remains challenging as practitioners navigate a fragmented landscape of quantization algorithms, precision budgets, data-driven calibration strategies, and hardware-dependent execution regimes. We present OneComp, an open-source compression framework that transforms this expert workflow into a reproducible, resource-adaptive pipeline. Given a model identifier and available hardware, OneComp automatically inspects the model, plans mixed-precision assignments, and executes progressive quantization stages, ranging from layer-wise compression to block-wise refinement and global refinement. A key architectural choice is treating the first quantized checkpoint as a deployable pivot, ensuring that each subsequent stage improves the same model and that quality increases as more compute is invested. By converting state-of-the-art compression research into an extensible, open-source, hardware-aware pipeline, OneComp bridges the gap between algorithmic innovation and production-grade model deployment.
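The abstract's key design, a cheap layer-wise pass producing a deployable "pivot" checkpoint, followed by refinement stages that can only improve that same model, can be sketched in miniature. Everything below (the uniform quantizer, the grid-search refinement, all function names) is an illustrative assumption for exposition, not OneComp's actual API or algorithm:

```python
import math

def quantize(weights, scale, bits=4):
    """Uniform symmetric quantization of a weight list at a given scale."""
    q_max = 2 ** (bits - 1) - 1
    return [max(-q_max, min(q_max, round(w / scale))) * scale for w in weights]

def mse(a, b):
    """Mean squared reconstruction error between original and quantized weights."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def layerwise_stage(weights, bits=4):
    """Stage 1: per-layer scale from the max magnitude -- the deployable pivot."""
    q_max = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / q_max
    return quantize(weights, scale, bits), scale

def refinement_stage(weights, base_scale, bits=4):
    """Stage 2: refine the pivot's scale by grid search over multipliers.

    The multiplier 1.0 is included, so the refined checkpoint is never
    worse than the pivot -- quality only increases as compute is invested.
    """
    best_q = quantize(weights, base_scale, bits)
    best_err = mse(weights, best_q)
    for mult in [0.8 + 0.02 * i for i in range(21)]:  # scan 0.80 .. 1.20
        q = quantize(weights, base_scale * mult, bits)
        err = mse(weights, q)
        if err < best_err:
            best_q, best_err = q, err
    return best_q, best_err

# Toy "layer" of weights; a real pipeline would iterate over model layers.
weights = [0.8, -1.3, 0.05, 2.1, -0.4, 1.7, -2.0, 0.3]
pivot, scale = layerwise_stage(weights)          # deployable immediately
refined, err = refinement_stage(weights, scale)  # strictly no worse
```

The sketch captures only the architectural claim (monotone improvement over a pivot checkpoint), not the mixed-precision planning or hardware-aware execution the paper describes.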

Comments: 31 pages, 6 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Computation and Language (cs.CL)

Cite as: arXiv:2603.28845 [cs.LG]

(or arXiv:2603.28845v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.28845

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yuma Ichikawa [v1] Mon, 30 Mar 2026 17:43:32 UTC (323 KB)
