
OmniRAG-Agent: Agentic Omnimodal Reasoning for Low-Resource Long Audio-Video Question Answering

arXiv, March 31, 2026

Authors: Yifan Zhu, Xinyu Mu, Tao Feng, Zhonghong Ou, Yuning Gong, Haoran Luo


Abstract: Long-horizon omnimodal question answering requires reasoning over text, images, audio, and video. Despite recent progress on OmniLLMs, low-resource long audio-video QA still suffers from costly dense encoding, weak fine-grained retrieval, limited proactive planning, and the lack of a clear end-to-end optimization scheme. To address these issues, we propose OmniRAG-Agent, an agentic omnimodal QA method for budgeted long audio-video reasoning. It builds an image-audio retrieval-augmented generation module that lets an OmniLLM fetch short, relevant frames and audio snippets from external banks. It also uses an agent loop that plans, calls tools across turns, and merges retrieved evidence to answer complex queries. Finally, we apply group relative policy optimization to jointly improve tool use and answer quality over time. Experiments on OmniVideoBench, WorldSense, and Daily-Omni show that OmniRAG-Agent consistently outperforms prior methods under low-resource settings, with ablations validating each component.
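The abstract names three mechanisms: retrieval from external frame and audio banks, a multi-turn agent loop that merges evidence, and group relative policy optimization. The sketch below is a toy illustration of those ideas, not the authors' implementation. Every name in it (retrieve, agent_loop, group_relative_advantages, the bank layout, the round-robin tool choice, the fixed turn budget) is a hypothetical stand-in, since the abstract does not specify the planner, the encoders, or the reward. The advantage function follows the commonly described group-relative form, in which each sampled rollout's reward is normalized by the mean and standard deviation of its group; using the population standard deviation is an assumption.

```python
import math
from statistics import mean, pstdev


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, bank, k):
    """Return the k bank items whose embeddings are most similar to the query."""
    ranked = sorted(bank, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return ranked[:k]


def agent_loop(query_vec, banks, max_turns=3, k=1):
    """Toy agent loop under a fixed turn budget.

    Assumption: the "plan" is a simple round-robin over the available banks.
    A real agent would let the model choose which tool to call each turn.
    """
    evidence = []
    tool_names = list(banks)
    for turn in range(max_turns):
        tool = tool_names[turn % len(tool_names)]
        seen = {item["id"] for item in evidence}
        # Skip items already retrieved so each turn adds new evidence.
        candidates = [item for item in banks[tool] if item["id"] not in seen]
        evidence.extend(retrieve(query_vec, candidates, k))
    return evidence


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch the turn budget (max_turns) and the per-call limit (k) together bound how many frames or audio snippets reach the model, which is one simple way to read "budgeted" reasoning. The rewards passed to group_relative_advantages would come from scoring several sampled rollouts of the same question; how the paper scores tool use and answer quality is not stated in the abstract.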

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2602.03707 [cs.CL] (or arXiv:2602.03707v4 [cs.CL] for this version)

DOI: https://doi.org/10.48550/arXiv.2602.03707 (arXiv-issued DOI via DataCite)

Submission history

From: Xinyu Mu
[v1] Tue, 3 Feb 2026 16:28:24 UTC (15,724 KB)
[v2] Wed, 4 Feb 2026 03:33:14 UTC (15,724 KB)
[v3] Sun, 22 Feb 2026 15:44:32 UTC (15,724 KB)
[v4] Mon, 30 Mar 2026 11:14:57 UTC (15,726 KB)

Original source

arXiv

https://arxiv.org/abs/2602.03707


