A Comprehensive Information-Decomposition Analysis of Large Vision-Language Models

arXiv cs.LG | by Lixin Xiu, Xufang Luo, Hideki Nakayama | April 1, 2026

Abstract: Large vision-language models (LVLMs) achieve impressive performance, yet their internal decision-making processes remain opaque, making it difficult to determine if the success stems from true multimodal fusion or from reliance on unimodal priors. To address this attribution gap, we introduce a novel framework using partial information decomposition (PID) to quantitatively measure the "information spectrum" of LVLMs -- decomposing a model's decision-relevant information into redundant, unique, and synergistic components. By adapting a scalable estimator to modern LVLM outputs, our model-agnostic pipeline profiles 26 LVLMs on four datasets across three dimensions -- breadth (cross-model & cross-task), depth (layer-wise information dynamics), and time (learning dynamics across training). Our analysis reveals two key results: (i) two task regimes (synergy-driven vs. knowledge-driven) and (ii) two stable, contrasting family-level strategies (fusion-centric vs. language-centric). We also uncover a consistent three-phase pattern in layer-wise processing and identify visual instruction tuning as the key stage where fusion is learned. Together, these contributions provide a quantitative lens beyond accuracy-only evaluation and offer insights for analyzing and designing the next generation of LVLMs. Code and data are available at this https URL.
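
To make the redundant / unique / synergistic split concrete, here is a minimal sketch of a two-source PID on toy discrete tables. It is not the scalable estimator the abstract refers to, which is not specified here; it assumes the classic Williams-Beer I_min redundancy measure instead, and the function names, the `(x1, x2, y)` table layout, and the two toy distributions are all hypothetical choices for illustration. One can loosely read x1 and x2 as stand-ins for a visual and a textual input and y as the model's decision.

```python
from collections import defaultdict
from math import log2


def marginal(joint, keep):
    """Marginalize a {(x1, x2, y): p} table onto the index positions in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return out


def mutual_information(joint, src, tgt=(2,)):
    """I(X_src; Y) in bits, where `src` and `tgt` are tuples of index positions."""
    p_sy = marginal(joint, src + tgt)
    p_s = marginal(joint, src)
    p_y = marginal(joint, tgt)
    total = 0.0
    for key, p in p_sy.items():
        if p > 0:
            s, y = key[:len(src)], key[len(src):]
            total += p * log2(p / (p_s[s] * p_y[y]))
    return total


def specific_information(joint, src, y):
    """I(Y=y; X_src) = sum_x p(x|y) * log2(p(y|x) / p(y)), in bits."""
    p_sy = marginal(joint, src + (2,))
    p_s = marginal(joint, src)
    p_y = marginal(joint, (2,))
    total = 0.0
    for key, p in p_sy.items():
        if key[-1] == y and p > 0:
            s = key[:-1]
            total += (p / p_y[(y,)]) * log2((p / p_s[s]) / p_y[(y,)])
    return total


def pid_imin(joint):
    """Two-source PID using the I_min redundancy (an assumed, illustrative choice)."""
    p_y = marginal(joint, (2,))
    # Redundancy: expected minimum specific information across the two sources.
    redundancy = sum(
        p * min(specific_information(joint, (0,), y[0]),
                specific_information(joint, (1,), y[0]))
        for y, p in p_y.items() if p > 0
    )
    i1 = mutual_information(joint, (0,))
    i2 = mutual_information(joint, (1,))
    i12 = mutual_information(joint, (0, 1))
    unique_1 = i1 - redundancy
    unique_2 = i2 - redundancy
    # Whatever the pair carries beyond redundant and unique parts is synergy.
    synergy = i12 - unique_1 - unique_2 - redundancy
    return {"redundant": redundancy, "unique_1": unique_1,
            "unique_2": unique_2, "synergistic": synergy}


# Toy tables: y = x1 XOR x2 (needs both inputs) and y = x1 = x2 (either input suffices).
xor_joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
copy_joint = {(a, a, a): 0.5 for a in (0, 1)}

if __name__ == "__main__":
    print("XOR :", pid_imin(xor_joint))
    print("COPY:", pid_imin(copy_joint))
```

On the XOR table the single bit of decision-relevant information is entirely synergistic, since neither input alone says anything about y; on the copy table the same bit is entirely redundant. In this toy picture, a synergy-heavy profile corresponds to genuine fusion of the two inputs, while a profile dominated by one source's unique information corresponds to reliance on a single modality.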

Comments: Accepted at ICLR 2026. Project page: this https URL

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.29676 [cs.LG] (or arXiv:2603.29676v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.29676

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Lixin Xiu
[v1] Tue, 31 Mar 2026 12:32:29 UTC (684 KB)

Original source

arXiv cs.LG

https://arxiv.org/abs/2603.29676
