Gradient Compression Beyond Low-Rank: Wavelet Subspaces Compact Optimizer States
Authors: Ziqing Wen, Ping Luo, Jiahuan Wang, Kun Yuan, Dongsheng Li, Tao Sun
Abstract: Large language models (LLMs) have shown impressive performance across a range of natural language processing tasks. However, their vast number of parameters introduces significant memory challenges during training, particularly when using memory-intensive optimizers like Adam. Existing memory-efficient algorithms often rely on techniques such as singular value decomposition projection or weight freezing. While these approaches help alleviate memory constraints, they generally produce suboptimal results compared to full-rank updates. In this paper, we investigate memory-efficient methods beyond low-rank training and propose a novel solution called Gradient Wavelet Transform (GWT), which applies wavelet transforms to gradients in order to significantly reduce the memory required to maintain optimizer states. We demonstrate that GWT can be seamlessly integrated with memory-intensive optimizers, enabling efficient training while maintaining performance. Through extensive experiments on both pre-training and fine-tuning tasks, we show that GWT achieves performance comparable to advanced memory-efficient optimizers and full-rank approaches in terms of both memory usage and training performance.
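The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a single-level Haar wavelet along the last axis of the gradient, keeps Adam's moment estimates only for the approximation coefficients (half the size of the gradient), and rescales the stateless detail coefficients with the same denominator before inverting the transform. The names `haar_forward`, `haar_inverse`, and `HaarAdamState`, the choice of Haar, and the treatment of detail coefficients are all assumptions made for illustration.

```python
import numpy as np


def haar_forward(x):
    """One-level Haar transform along the last axis (its length must be even)."""
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass (approximation) coefficients
    detail = (even - odd) / np.sqrt(2.0)  # high-pass (detail) coefficients
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    out = np.empty(approx.shape[:-1] + (2 * approx.shape[-1],), dtype=approx.dtype)
    out[..., 0::2] = (approx + detail) / np.sqrt(2.0)
    out[..., 1::2] = (approx - detail) / np.sqrt(2.0)
    return out


class HaarAdamState:
    """Hypothetical Adam-style state stored only in the approximation subspace."""

    def __init__(self, shape, beta1=0.9, beta2=0.999, eps=1e-8):
        half = tuple(shape[:-1]) + (shape[-1] // 2,)
        self.m = np.zeros(half)  # first moment, half the gradient size
        self.v = np.zeros(half)  # second moment, half the gradient size
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.t = 0

    def update(self, grad, lr):
        """Return the parameter update (same shape as grad) for one step."""
        approx, detail = haar_forward(grad)
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * approx
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * approx**2
        m_hat = self.m / (1.0 - self.beta1**self.t)  # bias correction
        v_hat = self.v / (1.0 - self.beta2**self.t)
        denom = np.sqrt(v_hat) + self.eps
        # Assumption: detail coefficients carry no state and reuse the same scaling.
        step = haar_inverse(m_hat / denom, detail / denom)
        return -lr * step


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.standard_normal((4, 8))
    state = HaarAdamState(grad.shape)
    update = state.update(grad, lr=1e-3)
    # The stored moments together hold as many values as one gradient,
    # versus twice that for standard Adam.
    print(update.shape, state.m.size + state.v.size, 2 * grad.size)
```

In this sketch a single transform level halves the optimizer state; applying the transform recursively to the approximation coefficients would shrink it further, at the cost of a coarser moment estimate.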
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2501.07237 [cs.LG]
(or arXiv:2501.07237v4 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2501.07237
arXiv-issued DOI via DataCite
Submission history
From: Ziqing Wen
[v1] Mon, 13 Jan 2025 11:35:09 UTC (322 KB)
[v2] Tue, 29 Jul 2025 09:24:44 UTC (216 KB)
[v3] Wed, 30 Jul 2025 01:07:39 UTC (216 KB)
[v4] Mon, 30 Mar 2026 13:25:46 UTC (210 KB)