Adversarial Attacks on Multimodal Large Language Models: A Comprehensive Survey

arXiv [Submitted on 30 Mar 2026]
Authors: Bhavuk Jain, Sercan Ö. Arık, Hardeo K. Thakur

Abstract: Multimodal large language models (MLLMs) integrate information from multiple modalities such as text, images, audio, and video, enabling complex capabilities including visual question answering and audio translation. While powerful, this increased expressiveness introduces new and amplified vulnerabilities to adversarial manipulation. This survey provides a comprehensive and systematic analysis of adversarial threats to MLLMs, moving beyond enumerating attack techniques to explain the underlying causes of model susceptibility. We introduce a taxonomy that organizes adversarial attacks according to attacker objectives, unifying diverse attack surfaces across modalities and deployment settings. We also present a vulnerability-centric analysis that links integrity attacks, safety and jailbreak failures, control and instruction hijacking, and training-time poisoning to shared architectural and representational weaknesses in multimodal systems. Together, the taxonomy and the analysis provide an explanatory foundation for understanding adversarial behavior in MLLMs and inform the development of more robust and secure multimodal language systems.

Comments: Survey paper, 37 pages, 10 figures, accepted at TMLR

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.27918 [cs.CR]

(or arXiv:2603.27918v1 [cs.CR] for this version)

https://doi.org/10.48550/arXiv.2603.27918

arXiv-issued DOI via DataCite (pending registration)

Journal reference: Transactions on Machine Learning Research, 2026

Submission history

From: Bhavuk Jain
[v1] Mon, 30 Mar 2026 00:16:31 UTC (4,951 KB)
