Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

arXiv, March 31, 2026
Authors: Yujin Lee, Sewon Kim, Daeun Moon, Seoyoon Jang, Hyunsoo Yoon

Abstract: Few-shot multi-class anomaly detection is crucial in real industrial settings, where only a few normal samples are available while numerous object types must be inspected. This setting is challenging because defect patterns vary widely across categories while normal samples remain scarce. Existing vision-language model-based approaches typically depend on class-specific anomaly descriptions or auxiliary modules, limiting both scalability and computational efficiency. In this work, we propose AnoPLe, a lightweight multimodal prompt learning framework that removes reliance on anomaly-type textual descriptions and avoids any external modules. AnoPLe employs bidirectional interactions between textual and visual prompts, allowing class semantics and instance-level cues to refine one another and form class-conditioned representations that capture shared normal patterns across categories. To enhance localization, we design a scale-aware prefix trained on both global and local views, enabling the prompts to capture both global context and fine-grained details. In addition, an alignment loss propagates local anomaly evidence to global features, strengthening the consistency between pixel- and image-level predictions. Despite its simplicity, AnoPLe achieves strong performance on MVTec-AD, VisA, and Real-IAD under the few-shot multi-class setting, surpassing prior approaches while remaining efficient and free from expert-crafted anomaly descriptions. Moreover, AnoPLe generalizes well to unseen anomalies and extends effectively to the medical domain.
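
The abstract does not give the form of the alignment loss, so the following is only a minimal illustrative sketch of the general idea of tying an image-level score to pixel-level evidence. It assumes, hypothetically, that the image-level target is the maximum of the pixel-level anomaly map and that the penalty is a squared difference; the function names `image_score_from_map` and `consistency_penalty` are made up for this example and are not taken from AnoPLe.

```python
from typing import List


def image_score_from_map(anomaly_map: List[List[float]]) -> float:
    """Aggregate a pixel-level anomaly map into one score via max pooling.

    Assumption for illustration only: the strongest local response stands in
    for the local anomaly evidence.
    """
    return max(max(row) for row in anomaly_map)


def consistency_penalty(image_score: float, anomaly_map: List[List[float]]) -> float:
    """Squared gap between a global score and the strongest local evidence.

    Hypothetical stand-in for an alignment term, not the paper's loss.
    """
    return (image_score - image_score_from_map(anomaly_map)) ** 2


# Toy 3x3 anomaly map with one strongly anomalous pixel in the centre.
anomaly_map = [
    [0.05, 0.10, 0.05],
    [0.10, 0.90, 0.20],
    [0.05, 0.15, 0.05],
]

# A global score that ignores the local defect is penalised more than one
# that agrees with it.
print(consistency_penalty(0.2, anomaly_map))
print(consistency_penalty(0.9, anomaly_map))
```

Under these assumptions, minimising the penalty pushes the image-level prediction toward the strongest pixel-level response, which is one simple way the two prediction levels could be kept consistent.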

Comments: accepted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2408.13516 [cs.CV] (or arXiv:2408.13516v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2408.13516

arXiv-issued DOI via DataCite

Submission history

From: Yujin Lee
[v1] Sat, 24 Aug 2024 08:41:19 UTC (46,204 KB)
[v2] Mon, 30 Mar 2026 03:34:35 UTC (11,492 KB)
