ShortcutBreaker: Low-Rank Noisy Bottleneck and Frequency Filtering Block for Multi-Class Unsupervised Anomaly Detection
arXiv:2510.18342v2 Announce Type: replace Abstract: Multi-class unsupervised anomaly detection (MUAD) has garnered growing research interest, as it seeks to develop a unified model for anomaly detection across multiple classes, i.e., eliminating the need to train separate models for distinct objects and thereby saving substantial computational resources. Under the MUAD setting, while advanced Transformer-based architectures have brought significant performance improvements, identity shortcuts persist: they directly copy inputs to outputs, narrowing the gap in reconstruction errors between norm — Peng Tang, Xiaobin Hu, Tingcheng Li, Yang Nan, Tobias Lasser, Hongwei Bran Li
View PDF HTML (experimental)
Abstract:Multi-class unsupervised anomaly detection (MUAD) has garnered growing research interest, as it seeks to develop a unified model for anomaly detection across multiple classes, i.e., eliminating the need to train separate models for distinct objects and thereby saving substantial computational resources. Under the MUAD setting, while advanced Transformer-based architectures have brought significant performance improvements, identity shortcuts persist: they directly copy inputs to outputs, narrowing the gap in reconstruction errors between normal and abnormal cases, and thereby making the two harder to distinguish. Therefore, we propose ShortcutBreaker, a novel unified feature-reconstruction framework for MUAD tasks, featuring two key innovations to address the issue of shortcuts. First, drawing on matrix rank inequality, we design a low-rank noisy bottleneck (LRNB) to project highdimensional features into a low-rank latent space, and theoretically demonstrate its capacity to prevent trivial identity reproduction. Second, leveraging ViTs global modeling capability instead of merely focusing on local features, we incorporate a global perturbation attention to prevent information shortcuts in the decoders. Extensive experiments are performed on four widely used anomaly detection benchmarks, including three industrial datasets (MVTec-AD, ViSA, and Real-IAD) and one medical dataset (Universal Medical). The proposed method achieves a remarkable image-level AUROC of 99.8%, 98.9%, 90.6%, and 87.8% on these four datasets, respectively, consistently outperforming previous MUAD methods across different scenarios Our code will be released..
Comments: Under Review
Subjects:
Artificial Intelligence (cs.AI)
Cite as: arXiv:2510.18342 [cs.AI]
(or arXiv:2510.18342v2 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2510.18342
arXiv-issued DOI via DataCite
Submission history
From: Peng Tang [view email] [v1] Tue, 21 Oct 2025 06:51:30 UTC (17,238 KB) [v2] Sat, 28 Mar 2026 03:30:52 UTC (27,212 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxivExclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models - WSJ
<a href="https://news.google.com/rss/articles/CBMiuANBVV95cUxPTHlYUm1vbnBJRVd0a2VFOWVrNDJsMGxKZHZjUlJRc0wwLWxpNmJZVkZjcXo5dkViM0xKclVXbjFPS3BHUkZsNzVxbUgwUmJTZHlSVnkzSHQzc3BlS2toeUMzaHl6SUJjTnJ5ZHpJX3B5M3FfV3NmS1NKUVFRLWM5VVl0T2RmdjFpVnVzQkJFbG56MUFuRk1vWWhrZVR6LWRpYlNsZ0hUNWpZc1FYeGZWU2tidzc5WXdrZFFnUHBVRmZZRkFPY0ZKTVZJdnExQVhwY21yMy01QlRBUnJyWXFEd3gzOWNYSGZSd2xqcHV5aHJFcl9Mb0ZheFR6TmVzRE9NZGdvczNtRndfTmpEYXZHYlJCUkJmQ3daY2h3Zi1XcGxJaWF2bHo0WEwwSTZNMkhJeVpkN1NFQVU0dkFZbVE1bVlTT3ozay1aWVZjcndhaXBEdHAwSHlGYkRLdjlXQnNmSjUxa21iRGVEeEJmNDZGUTNxdG96OGFtUmxjVUNvamRoaGMxeGUzOEpsWGJTT0pjN1B1bkNVanlqaWd5QVVPWllVdERYVjMtaThMWlpFVUFOSWdxTWNDYw?oc=5" target="_blank">Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models</a> <font color="#6f6f6f">WSJ</font>
Perplexity launches Secure Intelligence Institute to advance AI security, privacy, and safety research - Moneycontrol.com
<a href="https://news.google.com/rss/articles/CBMi9AFBVV95cUxQMG5ITUdubnA4ankwSVo4djlMckJIM2V3cVpTSjA3ZFhVVTc0dHNCMlBJV3dNUVpsZ1lhcEM2aFpkLUs4Ym9IRXZqZFI4OVpwa3E3bnFvNS1uQk5vOVVJYnZ0TGNSQ080VlJERVNlaUt3WVE1WWJfUDlOS0JTOVc4Mk96Rmc1OEp2TmktOEVCODRqTEdpajBGb0JVQmUwNlVjdE96U3U5MERHQm1fWUFMbUhseXluVGZJaUdpQXQxT0s3SzdjdTVLampLbmI4Vnd1a0E3MC1VYVF0STBKTGNMZmhFaE9YVlF6X3dFWHJCenpsNzZW0gH6AUFVX3lxTE56aE9Dc3VlNGw4cGhFRzRHTmhuMXplZTY2QVBsYVJPWVhWclhWbWZLT2xtNllaOU1VMnRVYm9KZUV5anpUbW0wX011T01pcjFpT3hOcWFreFQ1bUVjdFoyZ0pGVGxsazNKRUxKVnAzU1dhbEV3ZmlRZmxYeFNkcmpkZXkyTEtjZXotZno3YjdWSW01RGo0R2RHWGIzMTh6VUpvUFBDbEt5ODAtYTZqOHl3ZzFMVkRLV0ZwVmp3ZWZFWW1VYVFyLTYxa3dvU2ozNEJqRENwcG45WEVJUmtZUVprTFdud3lWN1hHODRiZmtINlh4SmtSZVNhNWc?oc=5" target="_blank">Perplexity launches Secure Intelligence Institute to advance
Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models - WSJ
<a href="https://news.google.com/rss/articles/CBMiuANBVV95cUxQa2VLbUtXOHVWRUhpVXVlV1hxYWNWeWJpTkR1M2ZQLWNuWjVVLW9LX2tEWmF6RzRnaXQ3dnIzam9Vd1pmaWR4VlY3Z0V6SWVRMXl6UTJaZndWanl5bHE5N2RST3FpdmFMTld1TklVWU9KNW5fVWo0bEo4OW1ta0RkaVpLSUpISWNlQlM2QmVHSnhNVVFnem5RVm85M2lLekdVekxRd3ktanNrQ2hna1l1d19Xd3JOSVRLQ3BnbFZ3Q2xJUHNxWVZ4Wlc0ekFWN2oxdFBVTm8xWXBQY3k0T1FXM3BlU0NsbHYzR0UzUjRxRmRpODVDWktIeFUzaXJweDZ6WXc2VE0wYVI5MjdicVZaSXVsdmRUeUhXTmFSZFFiRzJHdnNzNzk3aEhVVDU2dHdlMXVEeHJQUTM3d2JtT3Fjb2NwUDUwdEtIMXR2VzNUMGI2NjVHMEMxcUwzNXE3VzFnZ1R4TE92cXhxU1d5MGdMZXRvQVNWdV83SF9aZFY5QXNxVHlkZ0o3MVZrXzVtb2hPYXZ6UmZfM1o3WkV6emwzdkpRLW5yLXRVcmZaSWdaeVpHZ09zX3pSeA?oc=5" target="_blank">Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models</a> <font color="#6f6f6f">WSJ</font>
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers

Multi-Layered Memory Architectures for LLM Agents: An Experimental Evaluation of Long-Term Context Retention
arXiv:2603.29194v1 Announce Type: new Abstract: Long-horizon dialogue systems suffer from semanticdrift and unstable memory retention across extended sessions. This paper presents a Multi-Layer Memory Framework that decomposes dialogue history into working, episodic, and semantic layers with adaptive retrieval gating and retention regularization. The architecture controls cross-session drift while maintaining bounded context growth and computational efficiency. Experiments on LOCOMO, LOCCO, and LoCoMo show improved performance, achieving 46.85 Success Rate, 0.618 overall F1 with 0.594 multi-hop F1, and 56.90% six-period retention while reducing false memory rate to 5.1% and context usage to 58.40%. Results confirm enhanced long-term retention and reasoning stability under constrained conte

3D Architect: An Automated Approach to Three-Dimensional Modeling
arXiv:2603.29191v1 Announce Type: new Abstract: The aim of our paper is to render an object in 3-dimension using a set of its orthographic views. Corner detector (Harris Detector) is applied on the input views to obtain control points. These control points are projected perpendicular to respective views, in order to construct an envelope. A set of points describing the object in 3-dimension, are obtained from the intersection of these mutually perpendicular envelopes. These set of points are used to regenerate the surfaces of the object using computational geometry. At the end, the object in 3-dimension is rendered using OpenGL

SLVMEval: Synthetic Meta Evaluation Benchmark for Text-to-Long Video Generation
arXiv:2603.29186v1 Announce Type: new Abstract: This paper proposes the synthetic long-video meta-evaluation (SLVMEval), a benchmark for meta-evaluating text-to-video (T2V) evaluation systems. The proposed SLVMEval benchmark focuses on assessing these systems on videos of up to 10,486 s (approximately 3 h). The benchmark targets a fundamental requirement, namely, whether the systems can accurately assess video quality in settings that are easy for humans to assess. We adopt a pairwise comparison-based meta-evaluation framework. Building on dense video-captioning datasets, we synthetically degrade source videos to create controlled "high-quality versus low-quality" pairs across 10 distinct aspects. Then, we employ crowdsourcing to filter and retain only those pairs in which the degradation

Developing a Guideline for the Labovian-Structural Analysis of Oral Narratives in Japanese
arXiv:2603.29347v1 Announce Type: new Abstract: Narrative analysis is a cornerstone of qualitative research. One leading approach is the Labovian model, but its application is labor-intensive, requiring a holistic, recursive interpretive process that moves back and forth between individual parts of the transcript and the transcript as a whole. Existing Labovian datasets are available only in English, which differs markedly from Japanese in terms of grammar and discourse conventions. To address this gap, we introduce the first systematic guidelines for Labovian narrative analysis of Japanese narrative data. Our guidelines retain all six Labovian categories and extend the framework by providing explicit rules for clause segmentation tailored to Japanese constructions. In addition, our guidel

Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!