
Omni-Modal Dissonance Benchmark: Systematically Breaking Modality Consensus to Probe Robustness and Calibrated Abstention

arXiv [Submitted on 28 Mar 2026] · March 31, 2026

arXiv:2603.27187v1 · Announce Type: new · Authors: Zabir Al Nazi, Shubhashis Roy Dipta, Md Rizwan Parvez


Abstract: Existing omni-modal benchmarks attempt to measure modality-specific contributions, but their measurements are confounded: naturally co-occurring modalities carry correlated yet unequal information, making it unclear whether results reflect true modality reliance or information asymmetry. We introduce OMD-Bench, where all modalities are initially congruent, each presenting the same anchor (an object or event independently perceivable through video, audio, and text), which we then systematically corrupt to isolate each modality's contribution. We also evaluate calibrated abstention: whether models appropriately refrain from answering when evidence is conflicting. The benchmark comprises 4,080 instances spanning 27 anchors across eight corruption conditions. Evaluating ten omni-modal models under zero-shot and chain-of-thought prompting, we find that models over-abstain when two modalities are corrupted yet severely under-abstain when all three are, while maintaining high confidence (~60-100%) even under full corruption. Chain-of-thought prompting improves abstention alignment with human judgment but amplifies overconfidence rather than mitigating it. OMD-Bench provides a benchmark for diagnosing modality reliance, robustness to cross-modal inconsistency, and uncertainty calibration in omni-modal systems.
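As a rough illustration of the setup described in the abstract, the sketch below enumerates corruption conditions and tallies abstention by how many modalities were corrupted. It rests on an assumption the abstract does not state: that the "eight corruption conditions" are the 2^3 subsets of {video, audio, text}, including the fully congruent case. The function names and the record format are hypothetical and are not taken from the paper or any released code.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: the three modalities named in the abstract.
MODALITIES = ("video", "audio", "text")


def corruption_conditions(modalities=MODALITIES):
    """Return every subset of modalities that could be corrupted.

    Includes the empty subset (all modalities congruent) and the full
    subset (all modalities corrupted).
    """
    return [
        frozenset(subset)
        for k in range(len(modalities) + 1)
        for subset in combinations(modalities, k)
    ]


def abstention_rate_by_corruption_count(records):
    """Group (corrupted_modalities, abstained) pairs by corruption count.

    Returns a dict mapping the number of corrupted modalities to the
    fraction of instances on which the model abstained.
    """
    totals = defaultdict(int)
    abstained = defaultdict(int)
    for corrupted, did_abstain in records:
        k = len(corrupted)
        totals[k] += 1
        abstained[k] += int(did_abstain)
    return {k: abstained[k] / totals[k] for k in sorted(totals)}


if __name__ == "__main__":
    print(len(corruption_conditions()))  # 8 subsets of three modalities

    # Toy records for illustration only; these are not benchmark results.
    toy_records = [
        (frozenset(), False),
        (frozenset({"video"}), False),
        (frozenset({"video", "audio"}), True),
        (frozenset({"video", "audio"}), False),
        (frozenset(MODALITIES), True),
    ]
    print(abstention_rate_by_corruption_count(toy_records))
```

Grouping by corruption count mirrors how the abstract reports its findings (over-abstention with two corrupted modalities, under-abstention with three); the paper's actual metrics and condition definitions may differ.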

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2603.27187 [cs.LG] (or arXiv:2603.27187v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.27187

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zabir Al Nazi
[v1] Sat, 28 Mar 2026 08:29:15 UTC (1,391 KB)
