MAN++: Scaling Momentum Auxiliary Network for Supervised Local Learning in Vision Tasks

arXiv, March 31, 2026

arXiv:2507.16279v2 Announce Type: replace

Authors: Junhao Su, Feiyu Zhu, Hengyu Shi, Tianyang Han, Yurui Qiu, Junfeng Luo, Xiaoming Wei, Jialin Gao

Abstract: Deep learning typically relies on end-to-end backpropagation for training, a method that inherently suffers from issues such as update locking during parameter optimization, high GPU memory consumption, and a lack of biological plausibility. In contrast, supervised local learning seeks to mitigate these challenges by partitioning the network into multiple local blocks and designing independent auxiliary networks to update each block separately. However, because gradients are propagated solely within individual local blocks, performance degradation occurs, preventing supervised local learning from supplanting end-to-end backpropagation. To address these limitations and facilitate inter-block information flow, we propose the Momentum Auxiliary Network++ (MAN++). MAN++ introduces a dynamic interaction mechanism by employing the Exponential Moving Average (EMA) of parameters from adjacent blocks to enhance communication across the network. The auxiliary network, updated via EMA, effectively bridges the information gap between blocks. Notably, we observed that directly applying EMA parameters can be suboptimal due to feature discrepancies between local blocks. To resolve this issue, we introduce a learnable scaling bias that balances feature differences, thereby further improving performance. We validate MAN++ through extensive experiments on tasks that include image classification, object detection, and image segmentation, utilizing multiple network architectures. The experimental results demonstrate that MAN++ achieves performance comparable to end-to-end training while significantly reducing GPU memory usage. Consequently, MAN++ offers a novel perspective for supervised local learning and presents a viable alternative to conventional training methods.
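
The EMA mechanism described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the flat list-of-floats parameter representation, the momentum value, and in particular the additive form of the "scaling bias" are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List


def ema_update(aux_params: List[float],
               neighbor_params: List[float],
               momentum: float = 0.99) -> List[float]:
    """Move auxiliary parameters toward an adjacent block's parameters.

    Standard exponential moving average, applied elementwise:
        aux <- momentum * aux + (1 - momentum) * neighbor
    No gradient flows through this update; it is a plain copy-and-blend.
    """
    if len(aux_params) != len(neighbor_params):
        raise ValueError("parameter lists must have the same length")
    return [momentum * a + (1.0 - momentum) * n
            for a, n in zip(aux_params, neighbor_params)]


def apply_scaling_bias(ema_params: List[float],
                       bias: List[float]) -> List[float]:
    """Hypothetical correction applied on top of the EMA parameters.

    ASSUMPTION: the bias is modeled here as a per-parameter additive
    offset that would be trained by the local loss. The paper's actual
    formulation may differ.
    """
    if len(ema_params) != len(bias):
        raise ValueError("bias must match the parameter list length")
    return [p + b for p, b in zip(ema_params, bias)]


# Toy run: the auxiliary parameters drift toward the neighbor block's
# parameters over three steps (momentum 0.5 chosen only for readability).
aux = [0.0, 0.0]
neighbor = [1.0, 2.0]
for _ in range(3):
    aux = ema_update(aux, neighbor, momentum=0.5)

# The learnable bias then adjusts the EMA result for the local block.
effective = apply_scaling_bias(aux, bias=[0.125, -0.25])
```

In an actual training loop the EMA step would typically run after each optimizer step of the adjacent block, while the bias would be the only auxiliary quantity receiving gradients from the local loss; both points are assumptions about how such a scheme is usually wired, not details taken from the paper.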

Comments: Accepted by TPAMI

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2507.16279 [cs.CV]

(or arXiv:2507.16279v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2507.16279

arXiv-issued DOI via DataCite

Submission history

From: Junhao Su
[v1] Tue, 22 Jul 2025 06:50:19 UTC (15,588 KB)
[v2] Sat, 28 Mar 2026 12:37:48 UTC (19,177 KB)
