
Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models

arXiv. Submitted on 26 Mar 2026 (this version); latest version 27 Mar 2026 (v2).

Eyal Hadad, Mordechai Guri


Abstract: On-device Vision-Language Models (VLMs) promise data privacy via local execution. However, we show that the architectural shift toward Dynamic High-Resolution preprocessing (e.g., AnyRes) introduces an inherent algorithmic side-channel. Unlike static models, dynamic preprocessing decomposes images into a variable number of patches based on their aspect ratio, creating workload-dependent inputs. We demonstrate a dual-layer attack framework against local VLMs. In Tier 1, an unprivileged attacker can exploit significant execution-time variations using standard unprivileged OS metrics to reliably fingerprint the input's geometry. In Tier 2, by profiling Last-Level Cache (LLC) contention, the attacker can resolve semantic ambiguity within identical geometries, distinguishing between visually dense (e.g., medical X-rays) and sparse (e.g., text documents) content. By evaluating state-of-the-art models such as LLaVA-NeXT and Qwen2-VL, we show that combining these signals enables reliable inference of privacy-sensitive contexts. Finally, we analyze the security engineering trade-offs of mitigating this vulnerability, reveal substantial performance overhead with constant-work padding, and propose practical design recommendations for secure Edge AI deployments.
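To make the workload dependence concrete, here is a minimal Python sketch of AnyRes-style canvas selection. It is an illustration under stated assumptions, not the authors' code and not the exact preprocessing of any evaluated model: the tile size `TILE`, the candidate list `CANDIDATES`, and the helper names `select_canvas`, `tile_count`, and `padded_tile_count` are all hypothetical. The last helper shows one simple reading of constant-work padding, in which every input is processed as if it had the worst-case tile count.

```python
# Assumed tile side in pixels (illustrative, not taken from the paper).
TILE = 336
# Assumed candidate canvases as (width, height); illustrative only.
CANDIDATES = [(336, 672), (672, 336), (672, 672), (1008, 336), (336, 1008)]


def select_canvas(width, height, candidates=CANDIDATES):
    """Pick the canvas that keeps the most image pixels and wastes the least area."""
    best, best_eff, best_waste = None, -1, float("inf")
    for cw, ch in candidates:
        # Scale the image to fit inside the canvas while keeping its aspect ratio.
        scale = min(cw / width, ch / height)
        sw, sh = int(width * scale), int(height * scale)
        # Pixels of real image content retained (never more than the original).
        eff = min(sw * sh, width * height)
        # Canvas area that would be filled with padding.
        waste = cw * ch - eff
        if eff > best_eff or (eff == best_eff and waste < best_waste):
            best, best_eff, best_waste = (cw, ch), eff, waste
    return best


def tile_count(width, height):
    """Tiles processed for one image: canvas tiles plus one downscaled global view."""
    cw, ch = select_canvas(width, height)
    return (cw // TILE) * (ch // TILE) + 1


def padded_tile_count(candidates=CANDIDATES):
    """Worst-case tile count, i.e. the constant workload if every input is padded."""
    return max((cw // TILE) * (ch // TILE) for cw, ch in candidates) + 1


if __name__ == "__main__":
    for size in [(1000, 1000), (3000, 1000), (1000, 2000)]:
        n = tile_count(*size)
        print(size, "tiles:", n, "padding overhead:", padded_tile_count() / n)
```

Under these assumed candidates, a square image, a 3:1 landscape image, and a 1:2 portrait image map to 5, 4, and 3 tiles respectively, so the amount of encoder work differs by geometry alone. Padding every input to the 5-tile worst case removes that difference at the cost of extra work on the smaller inputs, which is the kind of trade-off the abstract refers to.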

Comments: 13 pages, 8 figures

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Cite as: arXiv:2603.25403 [cs.CR] (or arXiv:2603.25403v1 [cs.CR] for this version)

https://doi.org/10.48550/arXiv.2603.25403

arXiv-issued DOI via DataCite

Submission history

From: Eyal Hadad
[v1] Thu, 26 Mar 2026 12:53:49 UTC (5,693 KB)
[v2] Fri, 27 Mar 2026 15:01:28 UTC (5,694 KB)

Original source

arXiv

https://arxiv.org/abs/2603.25403v1



