ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning

HuggingFace Papers, March 30, 2026
ResAdapt is an input-side adaptation framework that dynamically allocates visual resources to improve multimodal large language models' efficiency in video tasks while maintaining high performance.

Published on Mar 30

Abstract

Multimodal Large Language Models (MLLMs) achieve stronger visual understanding by scaling input fidelity, yet the resulting growth in visual tokens makes it prohibitive to sustain high spatial resolution and long temporal context at the same time. We argue that the bottleneck lies not in how post-encoding representations are compressed but in the volume of pixels the encoder receives, and address it with ResAdapt, an input-side adaptation framework that learns how much visual budget each frame should receive before encoding. ResAdapt couples a lightweight Allocator with an unchanged MLLM backbone, so the backbone retains its native visual-token interface while receiving an operator-transformed input. We formulate allocation as a contextual bandit and train the Allocator with Cost-Aware Policy Optimization (CAPO), which converts sparse rollout feedback into a stable accuracy-cost learning signal. Across budget-controlled video QA, temporal grounding, and image reasoning tasks, ResAdapt improves low-budget operating points and often lies on or near the efficiency-accuracy frontier, with the clearest gains on reasoning-intensive benchmarks under aggressive compression. Notably, ResAdapt supports up to 16x more frames at the same visual budget while delivering a performance gain of over 15%. Code is available at https://github.com/Xnhyacinth/ResAdapt.
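To make the budget arithmetic concrete, here is a minimal sketch of how a per-frame resize factor could trade resolution for frame count under a fixed visual-token budget. It is not taken from the ResAdapt code. The one-token-per-patch model, the patch size of 14, the 448x448 frame size, and the names `visual_tokens`, `frames_within_budget`, and `cost_aware_reward` are all assumptions made for illustration. The reward function is likewise a hypothetical accuracy-minus-cost signal, not the CAPO objective, whose exact form the abstract does not give.

```python
import math


def visual_tokens(height: int, width: int, scale: float, patch: int = 14) -> int:
    """Tokens for one frame resized by `scale` on each side.

    Assumes one token per patch x patch tile (an illustrative simplification).
    """
    rows = max(1, math.ceil(height * scale / patch))
    cols = max(1, math.ceil(width * scale / patch))
    return rows * cols


def frames_within_budget(budget: int, height: int, width: int,
                         scale: float, patch: int = 14) -> int:
    """How many equally sized frames fit into a total token budget."""
    return budget // visual_tokens(height, width, scale, patch)


def cost_aware_reward(correct: bool, tokens_used: int, tokens_full: int,
                      lam: float = 0.5) -> float:
    """Hypothetical reward: 1 for a correct answer, minus a penalty
    proportional to the fraction of the full-resolution cost spent."""
    return float(correct) - lam * tokens_used / tokens_full


full = visual_tokens(448, 448, scale=1.0)      # 32 x 32 tiles = 1024 tokens
quarter = visual_tokens(448, 448, scale=0.25)  # 8 x 8 tiles = 64 tokens

# With a budget of 8192 tokens: 8 full-resolution frames or 128 quarter-scale frames.
frames_full = frames_within_budget(8192, 448, 448, scale=1.0)
frames_quarter = frames_within_budget(8192, 448, 448, scale=0.25)

# A correct answer obtained cheaply scores higher than one obtained at full cost.
cheap = cost_aware_reward(True, quarter, full)   # 1 - 0.5 * 64/1024 = 0.96875
costly = cost_aware_reward(True, full, full)     # 1 - 0.5 = 0.5
```

Under these assumptions, shrinking each side to a quarter cuts the per-frame token count by 16, so the same budget covers 16 times as many frames. This shows that a 16x frame increase is arithmetically possible at a fixed budget, not how the paper arrives at its figure. A learned allocator would choose the scale per frame rather than use one global value.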

arXiv: arxiv.org/abs/2603.28610
