Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning

arXiv, March 31, 2026

Authors: Rongjin Li, Zichen Tang, Xianghe Wang, Xinyi Hu, Zhengyu Wang, Zhengyu Lu, Yiling Huang, Jiayuan Chen, Weisheng Tan, Jiacheng Liu, Zhongjun Yang, Haihong E

Abstract: With the rapid progress of multimodal large language models (MLLMs), AI already performs well at literature retrieval and certain reasoning tasks, serving as a capable assistant to human researchers, yet it remains far from autonomous research. The fundamental reason is that current work on academic paper reasoning is largely confined to a search-oriented paradigm centered on pre-specified targets, with reasoning grounded in relevance retrieval, which struggles to support researcher-style full-document understanding, reasoning, and verification. To bridge this gap, we propose ScholScan, a new benchmark for academic paper reasoning. ScholScan introduces a scan-oriented task setting that asks models to read and cross-check entire papers like human researchers, scanning the document to identify consistency issues. The benchmark comprises 1,800 carefully annotated questions drawn from nine error categories across 13 natural-science domains and 715 papers, and provides detailed annotations for evidence localization and reasoning traces, together with a unified evaluation protocol. We assessed 15 models across 24 input configurations and conducted a fine-grained analysis of MLLM capabilities for all error categories. Across the board, retrieval-augmented generation (RAG) methods yield no significant improvements, revealing systematic deficiencies of current MLLMs on scan-oriented tasks and underscoring the challenge posed by ScholScan. We expect ScholScan to be the leading and representative work of the scan-oriented task paradigm.
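The abstract mentions a fine-grained analysis across error categories. As a rough illustration only, per-category scoring of that kind could be sketched as follows. The `ScanItem` record, its field names, the category labels, and the toy data are all hypothetical; none of them come from ScholScan, whose actual evaluation protocol is not described on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanItem:
    """One hypothetical benchmark question (all field names are assumptions)."""
    paper_id: str
    domain: str
    error_category: str
    is_correct: bool  # whether the model's answer was judged correct


def accuracy_by_category(items):
    """Return {error_category: fraction of items judged correct}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        totals[item.error_category] += 1
        hits[item.error_category] += int(item.is_correct)
    return {category: hits[category] / totals[category] for category in totals}


# Toy data invented purely for illustration; these are not results from the paper.
toy_items = [
    ScanItem("p1", "physics", "category_a", True),
    ScanItem("p1", "physics", "category_b", False),
    ScanItem("p2", "chemistry", "category_a", False),
    ScanItem("p3", "biology", "category_b", False),
]

scores = accuracy_by_category(toy_items)
```

Grouping by category rather than reporting a single overall score is what makes it possible to see which kinds of consistency errors a model misses most often.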

Comments: Accepted to ICLR 2026

Subjects: Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.28651 [cs.AI] (or arXiv:2603.28651v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.28651

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Rongjin Li. [v1] Fri, 27 Mar 2026 15:58:23 UTC (39,424 KB)
