SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation
arXiv:2603.28824v1 Announce Type: new
Abstract: Dataset condensation aims to synthesize compact yet informative datasets that retain the training efficacy of full-scale data, offering substantial gains in efficiency. Recent studies reveal that the condensation process can be vulnerable to backdoor attacks, where malicious triggers are injected into the condensed dataset, manipulating model behavior during inference. While prior approaches have made progress in balancing attack success rate and clean test accuracy, they often fall short in preserving stealthiness, especially in concealing the visual artifacts of condensed data or the perturbations introduced during inference. To address this challenge, we introduce Sneakdoor, which enhances stealthiness without compromising attack effectiveness. Sneakdoor exploits the inherent vulnerability of class decision boundaries and incorporates a generative module that constructs input-aware triggers aligned with local feature geometry, thereby minimizing detectability. This joint design enables the attack to remain imperceptible to both human inspection and statistical detection. Extensive experiments across multiple datasets demonstrate that Sneakdoor achieves a compelling balance among attack success rate, clean test accuracy, and stealthiness, substantially improving the invisibility of both the synthetic data and triggered samples while maintaining high attack efficacy. The code is available at this https URL.
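The abstract describes a generative module that produces input-aware triggers, i.e. a bounded perturbation that varies with each input rather than a fixed patch. The sketch below is not the paper's method; it is a minimal, generic illustration of the input-aware trigger idea, using a hypothetical single-layer generator and an assumed L-infinity imperceptibility budget `eps`.

```python
import numpy as np

def make_trigger_generator(input_dim, eps=8 / 255, seed=0):
    """Build a hypothetical input-aware trigger generator.

    Each input is mapped to its own perturbation, so the trigger
    differs from sample to sample (input-aware) instead of being a
    single fixed pattern.
    """
    rng = np.random.default_rng(seed)
    # Random linear "generative module" standing in for a learned network.
    W = rng.standard_normal((input_dim, input_dim)) / np.sqrt(input_dim)

    def generate(x):
        # tanh bounds every trigger component to [-eps, eps], mimicking
        # an L-infinity imperceptibility constraint.
        return eps * np.tanh(W @ x)

    return generate

def apply_trigger(x, generate):
    # Poisoned sample = clean input plus its input-specific trigger,
    # clipped back to the valid pixel range [0, 1].
    return np.clip(x + generate(x), 0.0, 1.0)
```

In an actual attack the generator would be trained jointly with the poisoning objective; here the random weights only demonstrate that the trigger is both bounded and input-dependent.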
Comments: 29 pages, 5 figures, accepted to NeurIPS 2025
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.28824 [cs.CR]
(or arXiv:2603.28824v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2603.28824
arXiv-issued DOI via DataCite
Submission history
From: He Yang [v1] Sun, 29 Mar 2026 09:00:25 UTC (2,762 KB)


