Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models

arXiv. Submitted on 26 Mar 2026 (this version); latest version 27 Mar 2026 (v2)

Authors: Eyal Hadad, Mordechai Guri

Abstract: On-device Vision-Language Models (VLMs) promise data privacy via local execution. However, we show that the architectural shift toward Dynamic High-Resolution preprocessing (e.g., AnyRes) introduces an inherent algorithmic side-channel. Unlike static models, dynamic preprocessing decomposes images into a variable number of patches based on their aspect ratio, creating workload-dependent inputs. We demonstrate a dual-layer attack framework against local VLMs. In Tier 1, an unprivileged attacker can exploit significant execution-time variations using standard unprivileged OS metrics to reliably fingerprint the input's geometry. In Tier 2, by profiling Last-Level Cache (LLC) contention, the attacker can resolve semantic ambiguity within identical geometries, distinguishing between visually dense (e.g., medical X-rays) and sparse (e.g., text documents) content. By evaluating state-of-the-art models such as LLaVA-NeXT and Qwen2-VL, we show that combining these signals enables reliable inference of privacy-sensitive contexts. Finally, we analyze the security engineering trade-offs of mitigating this vulnerability, reveal substantial performance overhead with constant-work padding, and propose practical design recommendations for secure Edge AI deployments.
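To make the source of the leak concrete, the following is a minimal sketch of how an AnyRes-style preprocessor could map image geometry to a tile count. It is not taken from the paper: the tile size, the candidate canvas list, the extra global-view tile, and the names `select_canvas`, `tile_count`, and `padded_tile_count` are all illustrative assumptions, loosely modeled on how dynamic high-resolution preprocessing is commonly described.

```python
from typing import List, Tuple

TILE = 336  # assumed tile edge in pixels (illustrative, not from the paper)

# Assumed candidate canvases as (width, height); each is a whole number of tiles.
CANDIDATES: List[Tuple[int, int]] = [
    (336, 672), (672, 336), (672, 672), (1008, 336), (336, 1008),
]


def select_canvas(width: int, height: int) -> Tuple[int, int]:
    """Pick the canvas that keeps the most image pixels, then wastes the least area."""
    best, best_key = None, None
    for cw, ch in CANDIDATES:
        # Scale the image to fit inside the canvas while keeping its aspect ratio.
        scale = min(cw / width, ch / height)
        scaled_w, scaled_h = int(width * scale), int(height * scale)
        # Upscaling adds no information, so cap at the original pixel count.
        effective = min(scaled_w * scaled_h, width * height)
        wasted = cw * ch - effective
        key = (effective, -wasted)
        if best_key is None or key > best_key:
            best, best_key = (cw, ch), key
    return best


def tile_count(width: int, height: int) -> int:
    """Tiles the vision encoder would process for this image geometry."""
    cw, ch = select_canvas(width, height)
    # +1 is an assumed downscaled global view of the whole image.
    return (cw // TILE) * (ch // TILE) + 1


def padded_tile_count() -> int:
    """Constant-work padding: always process the worst-case number of tiles."""
    return max((cw // TILE) * (ch // TILE) for cw, ch in CANDIDATES) + 1


if __name__ == "__main__":
    for w, h in [(1000, 1000), (1000, 500), (1500, 500), (500, 1500)]:
        print(f"{w}x{h}: {tile_count(w, h)} tiles, padded to {padded_tile_count()}")
```

Under these assumptions a 2:1 image yields 3 tiles while a square image yields 5, so encoder work, and with it execution time, depends on the input's shape. Padding every input to 5 tiles removes that dependence at the cost of extra work on the smaller geometries; the ratio here is only a property of this toy configuration, not the overhead measured in the paper.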

Comments: 13 pages, 8 figures

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Cite as: arXiv:2603.25403 [cs.CR] (or arXiv:2603.25403v1 [cs.CR] for this version)

DOI: https://doi.org/10.48550/arXiv.2603.25403 (arXiv-issued DOI via DataCite)

Submission history

From: Eyal Hadad
[v1] Thu, 26 Mar 2026 12:53:49 UTC (5,693 KB)
[v2] Fri, 27 Mar 2026 15:01:28 UTC (5,694 KB)
