Live
Black Hat USAAI BusinessBlack Hat AsiaAI BusinessAI’s next frontier is the real worldFortune TechDebris from aerial interception strikes Oracle building in Dubai, UAE saysCNBC TechnologyI Audited 30+ Small Businesses on Their AI Visibility. Here's What Most Are Getting Wrong.Dev.to AIHow to Actually Monitor Your LLM Costs (Without a Spreadsheet)Dev.to AIОдин промпт приносит мне $500 в неделю на фрилансеDev.to AINetflix AI Team Just Open-Sourced VOID: an AI Model That Erases Objects From Videos — Physics and AllMarkTechPostUnderstanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.DEV CommunityHow to Supercharge Your AI Coding Workflow with Oh My CodexDev.to AIThe 11 steps that run every time you press Enter in Claude CodeDev.to AIBig Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.Dev.to AIOptimizing Claude Code token usage: lessons learnedDEV CommunityAgents Bedrock AgentCore en mode VPC : attention aux coûts de NAT Gateway !DEV CommunityBlack Hat USAAI BusinessBlack Hat AsiaAI BusinessAI’s next frontier is the real worldFortune TechDebris from aerial interception strikes Oracle building in Dubai, UAE saysCNBC TechnologyI Audited 30+ Small Businesses on Their AI Visibility. Here's What Most Are Getting Wrong.Dev.to AIHow to Actually Monitor Your LLM Costs (Without a Spreadsheet)Dev.to AIОдин промпт приносит мне $500 в неделю на фрилансеDev.to AINetflix AI Team Just Open-Sourced VOID: an AI Model That Erases Objects From Videos — Physics and AllMarkTechPostUnderstanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.DEV CommunityHow to Supercharge Your AI Coding Workflow with Oh My CodexDev.to AIThe 11 steps that run every time you press Enter in Claude CodeDev.to AIBig Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.Dev.to AIOptimizing Claude Code token usage: lessons learnedDEV CommunityAgents Bedrock AgentCore en mode VPC : attention aux coûts de NAT Gateway !DEV Community
AI NEWS HUBbyEIGENVECTOREigenvector

M-RAG: Making RAG Faster, Stronger, and More Efficient

arXivby [Submitted on 6 Jan 2026]March 31, 20262 min read1 views
Source Quiz

arXiv:2603.26667v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has become a widely adopted paradigm for enhancing the reliability of large language models (LLMs). However, RAG systems are sensitive to retrieval strategies that rely on text chunking to construct retrieval units, which often introduce information fragmentation, retrieval noise, and reduced efficiency. Recent work has even questioned the necessity of RAG, arguing that long-context LLMs may eliminate multi-stage retrieval pipelines by directly processing full documents. Nevertheless, expanded context capaci — Sun Xu, Tongkai Xu, Baiheng Xie, Li Huang, Qiang Gao, Kunpeng Zhang

View PDF HTML (experimental)

Abstract:Retrieval-Augmented Generation (RAG) has become a widely adopted paradigm for enhancing the reliability of large language models (LLMs). However, RAG systems are sensitive to retrieval strategies that rely on text chunking to construct retrieval units, which often introduce information fragmentation, retrieval noise, and reduced efficiency. Recent work has even questioned the necessity of RAG, arguing that long-context LLMs may eliminate multi-stage retrieval pipelines by directly processing full documents. Nevertheless, expanded context capacity alone does not resolve the challenges of relevance filtering, evidence prioritization, and isolating answer-bearing information. To this end, we proposed M-RAG, a novel Chunk-free retrieval strategy. Instead of retrieving coarse-grained textual chunks, M-RAG extracts structured, k-v decomposition meta-markers, with a lightweight, intent-aligned retrieval key for retrieval and a context-rich information value for generation. Under this setting, M-RAG enables efficient and stable query-key similarity matching without sacrificing expressive ability. Experimental results on the LongBench subtasks demonstrate that M-RAG outperforms chunk-based RAG baselines across varying token budgets, particularly under low-resource settings. Extensive analysis further reveals that M-RAG retrieves more answer-friendly evidence with high efficiency, validating the effectiveness of decoupling retrieval representation from generation and highlighting the proposed strategy as a scalable and robust alternative to existing chunk-based methods.

Subjects:

Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.26667 [cs.IR]

(or arXiv:2603.26667v1 [cs.IR] for this version)

https://doi.org/10.48550/arXiv.2603.26667

arXiv-issued DOI via DataCite

Submission history

From: Li Huang [view email] [v1] Tue, 6 Jan 2026 15:14:54 UTC (4,255 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

researchpaperarxiv

Knowledge Map

Knowledge Map
TopicsEntitiesSource
M-RAG: Maki…researchpaperarxivaiartificial-…arXiv

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 317 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Research Papers