Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

About

Few-shot multi-class anomaly detection is crucial in real industrial settings, where only a few normal samples are available while numerous object types must be inspected. This setting is challenging because defect patterns vary widely across categories while normal samples remain scarce. Existing approaches based on vision-language models typically depend on class-specific anomaly descriptions or auxiliary modules, limiting both scalability and computational efficiency. In this work, we propose AnoPLe, a lightweight multimodal prompt learning framework that removes reliance on anomaly-type textual descriptions and avoids any external modules. AnoPLe employs bidirectional interactions between textual and visual prompts, allowing class semantics and instance-level cues to refine one another and form class-conditioned representations that capture shared normal patterns across categories. To enhance localization, we design a scale-aware prefix trained on both global and local views, enabling the prompts to capture both global context and fine-grained details. In addition, an alignment loss propagates local anomaly evidence to global features, strengthening the consistency between pixel- and image-level predictions. Despite its simplicity, AnoPLe achieves strong performance on MVTec-AD, VisA, and Real-IAD under the few-shot multi-class setting, surpassing prior approaches while remaining efficient and free from expert-crafted anomaly descriptions. Moreover, AnoPLe generalizes well to unseen anomalies and extends effectively to the medical domain.
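
The abstract does not give the form of the alignment loss, so the following is only an illustrative sketch of the general idea of tying an image-level score to pixel-level evidence. It uses a simple max-pooling consistency penalty, which is a stand-in technique and not AnoPLe's actual loss. The names `image_score_from_map` and `consistency_penalty` are hypothetical.

```python
from typing import List

AnomalyMap = List[List[float]]  # 2D grid of per-pixel anomaly scores in [0, 1]


def image_score_from_map(anomaly_map: AnomalyMap) -> float:
    """Summarize local evidence as the maximum pixel score.

    Assumption: max pooling is a common heuristic for turning a pixel-level
    map into one image-level score; the paper may use something else.
    """
    return max(max(row) for row in anomaly_map)


def consistency_penalty(global_score: float, anomaly_map: AnomalyMap) -> float:
    """Squared gap between the image-level score and the strongest local score.

    Assumption: a hypothetical stand-in for an alignment term. It is zero when
    the image-level prediction agrees with the peak of the pixel-level map.
    """
    local_peak = image_score_from_map(anomaly_map)
    return (global_score - local_peak) ** 2


# Toy example: one pixel looks anomalous (0.9) but the global score is low (0.4),
# so the penalty is positive and would push the two predictions together.
anomaly_map = [[0.1, 0.2], [0.9, 0.3]]
penalty = consistency_penalty(0.4, anomaly_map)
agreeing = consistency_penalty(0.9, anomaly_map)
```

In a real training loop such a term would be computed on differentiable tensors and added to the main objective; the pure-Python version above only shows the arithmetic.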

Yujin Lee, Sewon Kim, Daeun Moon, Seoyoon Jang, Hyunsoo Yoon • 2024

Related benchmarks

Task	Dataset	Result
Anomaly Localization	MVTec AD	Pixel AUROC 96.5	534
Anomaly Detection	VisA	AUROC (Image-level) 87.5	79
Anomaly Detection	MVTec AD	Image-level AUROC 96.4	52
Anomaly Localization	Real-IAD	P-AUROC 97.4	43
Anomaly Localization	VisA	--	38
Anomaly Detection	Retina OCT	Image-level AUROC 0.914	22
Anomaly Detection	Real-IAD	AUROC (Image-level) 0.832	18
Anomaly Detection	BMAD Liver CT	I-AUC 74.8	6
Anomaly Localization	BMAD Brain MRI	P-AUC 97.1	6
Anomaly Localization	BMAD Retinal OCT	P-AUC 97	6

Showing 10 of 12 rows
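
The benchmarks above report image-level and pixel-level AUROC. As a reminder of what these metrics measure, here is a minimal sketch that computes AUROC as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one, with ties counted as one half. The helper names `auroc` and `flatten` and the toy scores are illustrative and not taken from the paper. The pairwise loop is quadratic, so it is only suited to small examples.

```python
from typing import List, Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via pairwise comparison (labels: 1 = anomalous, 0 = normal)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous sample ranked above normal sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def flatten(maps: List[List[List[float]]]) -> List[float]:
    """Flatten a list of 2D maps into one list of per-pixel values."""
    return [v for m in maps for row in m for v in row]


# Image-level AUROC: one score and one label per image.
image_auroc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])

# Pixel-level AUROC: pool every pixel of every image, then score as above.
score_maps = [[[0.1, 0.9], [0.2, 0.8]]]
masks = [[[0, 1], [0, 1]]]
pixel_auroc = auroc(flatten(score_maps), [int(v) for v in flatten(masks)])
```

The function returns a fraction in [0, 1]. The table appears to mix scales, with most rows given as percentages (for example 96.5) and some as fractions (for example 0.914).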
