Learning to See through Illumination Extremes with Event Streaming in Multimodal Large Language Models

arXiv, March 31, 2026

arXiv:2603.27558v1

Authors: Baoheng Zhang, Jiahui Liu, Gui Zhao, Weizhou Zhang, Yixuan Ma, Jun Jiang, Yingxian Chen, Wilton W. T. Fok, Xiaojuan Qi, Hayden Kwok-Hay So

Abstract: Multimodal Large Language Models (MLLMs) reason well over vision and language under standard conditions but fail in extreme illumination, where RGB inputs irrevocably lose structure and semantics. We propose Event-MLLM, an event-enhanced model that performs all-light visual reasoning by dynamically fusing event streams with RGB frames. Two key components drive our approach: an Illumination Indicator, a learnable signal derived from a DINOv2 branch that represents exposure degradation and adaptively modulates event-RGB fusion, and an Illumination Correction Loss that aligns fused features with non-degraded (normal-light) semantics in the latent space, compensating for information lost in extreme lighting. We curate the first multi-illumination event-instruction corpus for MLLMs, with 2,241 event-RGB samples (around 6 QA pairs each) across diverse scenes and 17 brightness rates (0.05x to 20x), plus an instruction-following benchmark for reasoning, counting, and fine-grained recognition under extreme lighting. Experiments show that Event-MLLM markedly outperforms general-purpose, illumination-adaptive, and event-only baselines, setting a new state of the art in robust multimodal perception and reasoning under challenging illumination.
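The abstract does not give formulas for the two components, so the following is only a toy sketch of the general idea, not the authors' method. It assumes the Illumination Indicator can be reduced to a single scalar logit that gates a convex blend of RGB and event features, and it assumes the Illumination Correction Loss can be approximated by a mean squared error against normal-light features. The names `fuse`, `correction_loss`, and `indicator_logit` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fuse(rgb_feat, event_feat, indicator_logit):
    """Blend RGB and event features with a scalar gate (assumed form).

    A gate near 1 means the RGB input is judged degraded, so the
    fused feature leans on the event stream; near 0 it leans on RGB.
    """
    g = sigmoid(indicator_logit)
    return [(1.0 - g) * r + g * e for r, e in zip(rgb_feat, event_feat)]


def correction_loss(fused_feat, normal_light_feat):
    """Mean squared error to normal-light features (assumed loss form)."""
    n = len(fused_feat)
    return sum((f - t) ** 2 for f, t in zip(fused_feat, normal_light_feat)) / n


# Made-up feature vectors for illustration only.
rgb = [0.0, 0.1, 0.0]      # badly exposed frame: little usable signal
event = [0.8, 0.4, 0.6]    # event stream still carries structure
target = [0.8, 0.5, 0.6]   # features of the same scene in normal light

trust_rgb = fuse(rgb, event, indicator_logit=-4.0)
trust_events = fuse(rgb, event, indicator_logit=4.0)

print(correction_loss(trust_rgb, target))
print(correction_loss(trust_events, target))
```

In this toy setup the loss is lower when the gate favors the event stream, which is the behavior a training signal of this kind would push the indicator toward under extreme lighting.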

Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.27558 [cs.CV]

(or arXiv:2603.27558v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.27558

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Jiahui Liu
[v1] Sun, 29 Mar 2026 07:46:32 UTC (1,879 KB)
