
Gradient Compression Beyond Low-Rank: Wavelet Subspaces Compact Optimizer States

arXiv · March 31, 2026 · 10 min read

arXiv:2501.07237v4 (announce type: replace-cross)

Authors: Ziqing Wen, Ping Luo, Jiahuan Wang, Kun Yuan, Dongsheng Li, Tao Sun


Abstract: Large language models (LLMs) have shown impressive performance across a range of natural language processing tasks. However, their vast number of parameters introduces significant memory challenges during training, particularly when using memory-intensive optimizers like Adam. Existing memory-efficient algorithms often rely on techniques such as singular value decomposition projection or weight freezing. While these approaches help alleviate memory constraints, they generally produce suboptimal results compared to full-rank updates. In this paper, we investigate memory-efficient methods beyond low-rank training and propose a novel solution called Gradient Wavelet Transform (GWT), which applies wavelet transforms to gradients in order to significantly reduce the memory required to maintain optimizer states. We demonstrate that GWT can be seamlessly integrated with memory-intensive optimizers, enabling efficient training while maintaining performance. Through extensive experiments on both pre-training and fine-tuning tasks, we show that GWT achieves performance comparable to advanced memory-efficient optimizers and full-rank approaches in terms of both memory usage and training performance.
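The idea of keeping optimizer states in a wavelet subspace can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the abstract does not say which wavelet is used, how many levels are applied, or how the update is mapped back to the full parameter shape. The sketch assumes an orthonormal Haar wavelet along the last axis, keeps only the approximation coefficients, runs a standard Adam update on them, and reconstructs the update with the detail coefficients set to zero. The names `haar_forward`, `haar_inverse`, and `CompressedAdam` are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_forward(x):
    """One-level orthonormal Haar transform along the last axis (even length)."""
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / SQRT2   # low-frequency half
    detail = (even - odd) / SQRT2   # high-frequency half
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    out = np.empty(approx.shape[:-1] + (2 * approx.shape[-1],), dtype=approx.dtype)
    out[..., 0::2] = (approx + detail) / SQRT2
    out[..., 1::2] = (approx - detail) / SQRT2
    return out


class CompressedAdam:
    """Adam whose moment buffers live in the Haar approximation subspace.

    Assumption for illustration only: detail coefficients are discarded,
    so each level halves the size of both moment buffers.
    """

    def __init__(self, shape, levels=1, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        if shape[-1] % (2 ** levels) != 0:
            raise ValueError("last dimension must be divisible by 2**levels")
        state_shape = tuple(shape[:-1]) + (shape[-1] // 2 ** levels,)
        self.m = np.zeros(state_shape)  # first moment, compressed
        self.v = np.zeros(state_shape)  # second moment, compressed
        self.levels, self.lr = levels, lr
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.t = 0

    def step(self, param, grad):
        # Project the gradient onto the approximation subspace.
        a = grad
        for _ in range(self.levels):
            a, _ = haar_forward(a)

        # Standard Adam moment updates with bias correction, on the small tensor.
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * a
        self.v = self.beta2 * self.v + (1 - self.beta2) * a * a
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        update = m_hat / (np.sqrt(v_hat) + self.eps)

        # Map the update back to full size with zeroed detail coefficients.
        for _ in range(self.levels):
            update = haar_inverse(update, np.zeros_like(update))
        return param - self.lr * update
```

In this sketch, `CompressedAdam((4, 8), levels=2)` stores two moment buffers of shape `(4, 2)`, a quarter of the entries that plain Adam would keep for a `(4, 8)` parameter. The cost is that the update loses whatever information the discarded detail coefficients carried.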

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Cite as: arXiv:2501.07237 [cs.LG] (or arXiv:2501.07237v4 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2501.07237

arXiv-issued DOI via DataCite

Submission history

From: Ziqing Wen
[v1] Mon, 13 Jan 2025 11:35:09 UTC (322 KB)
[v2] Tue, 29 Jul 2025 09:24:44 UTC (216 KB)
[v3] Wed, 30 Jul 2025 01:07:39 UTC (216 KB)
[v4] Mon, 30 Mar 2026 13:25:46 UTC (210 KB)

Original source

arXiv
