
OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in Multimodal Large Language Models

arXiv, March 31, 2026

arXiv:2603.09326v2 (announce type: replace)

Authors: Tengjin Weng, Wenhao Jiang, Jingyi Wang, Ming Li, Lin Ma, Zhong Ming


Abstract: Multimodal large language models (MLLMs) have achieved remarkable performance across a wide range of vision-language tasks. However, their low-level visual perception, particularly the detection of fine-grained visual discrepancies, remains underexplored and lacks systematic analysis. In this work, we introduce OddGridBench, a controllable benchmark for evaluating the visual discrepancy sensitivity of MLLMs. OddGridBench comprises over 1,400 grid-based images, in each of which a single element differs from all others in one or more visual attributes such as color, size, rotation, or position. Experiments reveal that all evaluated MLLMs, including open-source families such as Qwen3-VL and InternVL3.5 and proprietary systems such as Gemini-2.5-Pro and GPT-5, perform far below human levels in visual discrepancy detection. We further propose OddGrid-GRPO, a reinforcement learning framework that integrates curriculum learning and a distance-aware reward. By progressively controlling the difficulty of training samples and incorporating spatial proximity constraints into the reward design, OddGrid-GRPO significantly enhances the model's fine-grained visual discrimination ability. We hope OddGridBench and OddGrid-GRPO will lay the groundwork for advancing perceptual grounding and visual discrepancy sensitivity in multimodal intelligence. Code and dataset are available at this https URL.
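The abstract does not give the reward formula, so the following is only an illustrative sketch of what a distance-aware reward over grid cells could look like. It is not the authors' OddGrid-GRPO implementation. The function name `distance_aware_reward`, the Gaussian decay, and the `sigma` parameter are all assumptions made for illustration: an exact hit earns full reward, and a near miss earns partial credit that shrinks with distance from the odd cell.

```python
import math


def distance_aware_reward(pred, target, rows, cols, sigma=1.0):
    """Hypothetical distance-aware reward (not the paper's formula).

    pred, target: (row, col) grid coordinates, zero-indexed.
    rows, cols:   grid dimensions, used to reject out-of-grid answers.
    sigma:        assumed decay width; larger values are more forgiving.
    """
    pr, pc = pred
    # An answer outside the grid gets no credit at all.
    if not (0 <= pr < rows and 0 <= pc < cols):
        return 0.0
    # Euclidean distance between the predicted and the true odd cell.
    dist = math.hypot(pr - target[0], pc - target[1])
    # Gaussian decay: 1.0 for an exact hit, smaller for farther misses.
    return math.exp(-(dist ** 2) / (2 * sigma ** 2))


if __name__ == "__main__":
    target = (2, 3)
    for guess in [(2, 3), (2, 4), (0, 0), (9, 9)]:
        print(guess, round(distance_aware_reward(guess, target, 5, 5), 4))
```

Under these assumptions, a guess adjacent to the odd element scores higher than a guess in a far corner, which is one way the "spatial proximity constraints" mentioned in the abstract could give a denser learning signal than a binary right-or-wrong reward.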

Comments: accepted by CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.09326 [cs.CV] (or arXiv:2603.09326v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.09326

arXiv-issued DOI via DataCite

Submission history

From: Tengjin Weng
[v1] Tue, 10 Mar 2026 08:01:30 UTC (7,152 KB)
[v2] Mon, 30 Mar 2026 12:07:08 UTC (7,147 KB)

Original source

arXiv

https://arxiv.org/abs/2603.09326
