Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

About

Few-shot multi-class anomaly detection is crucial in real industrial settings, where only a few normal samples are available while numerous object types must be inspected. This setting is challenging because defect patterns vary widely across categories while normal samples remain scarce. Existing approaches based on vision-language models typically depend on class-specific anomaly descriptions or auxiliary modules, limiting both scalability and computational efficiency. In this work, we propose AnoPLe, a lightweight multimodal prompt learning framework that removes reliance on anomaly-type textual descriptions and avoids any external modules. AnoPLe employs bidirectional interactions between textual and visual prompts, allowing class semantics and instance-level cues to refine one another and form class-conditioned representations that capture shared normal patterns across categories. To enhance localization, we design a scale-aware prefix trained on both global and local views, enabling the prompts to capture both global context and fine-grained details. In addition, an alignment loss propagates local anomaly evidence to global features, strengthening the consistency between pixel- and image-level predictions. Despite its simplicity, AnoPLe achieves strong performance on MVTec-AD, VisA, and Real-IAD under the few-shot multi-class setting, surpassing prior approaches while remaining efficient and free from expert-crafted anomaly descriptions. Moreover, AnoPLe generalizes well to unseen anomalies and extends effectively to the medical domain.

Yujin Lee, Sewon Kim, Daeun Moon, Seoyoon Jang, Hyunsoo Yoon • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Anomaly Localization | MVTec AD | Pixel AUROC 96.5 | 513 |
| Anomaly Detection | MVTec AD | Image-level AUROC 96.4 | 52 |
| Anomaly Detection | VisA | -- | 52 |
| Anomaly Localization | Real-IAD | P-AUROC 97.4 | 43 |
| Anomaly Localization | VisA | -- | 35 |
| Anomaly Detection | Retina OCT | Image-level AUROC 0.914 | 22 |
| Anomaly Detection | Real-IAD | AUROC (Image-level) 0.832 | 18 |
| Anomaly Detection | BMAD Liver CT | I-AUC 74.8 | 6 |
| Anomaly Localization | BMAD Brain MRI | P-AUC 97.1 | 6 |
| Anomaly Localization | BMAD Retinal OCT | P-AUC 97 | 6 |
Showing 10 of 12 rows
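
The results above are reported as image-level and pixel-level AUROC. As a rough illustration of what these two metrics measure, the sketch below computes both from per-pixel anomaly maps and ground-truth masks. It is not the AnoPLe evaluation code: the function names are hypothetical, and taking the maximum of each anomaly map as the image-level score is an assumption made for this example only.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) formulation, with tie handling."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one normal and one anomalous sample")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def image_and_pixel_auroc(anomaly_maps, masks):
    """Return (image-level AUROC, pixel-level AUROC) for 2D maps and binary masks.

    Assumption for this sketch: an image's score is the maximum value of its
    anomaly map, and an image is anomalous if its mask has any positive pixel.
    """
    image_scores = [max(max(row) for row in amap) for amap in anomaly_maps]
    image_labels = [int(any(any(row) for row in mask)) for mask in masks]
    pixel_scores = [v for amap in anomaly_maps for row in amap for v in row]
    pixel_labels = [int(v) for mask in masks for row in mask for v in row]
    return auroc(image_scores, image_labels), auroc(pixel_scores, pixel_labels)


# Toy data: one normal image and one image with two defective pixels.
maps = [[[0.1, 0.2], [0.1, 0.1]], [[0.1, 0.9], [0.2, 0.8]]]
gts = [[[0, 0], [0, 0]], [[0, 1], [0, 1]]]
image_auc, pixel_auc = image_and_pixel_auroc(maps, gts)
```

The function returns values in the range 0 to 1, whereas several rows in the table appear to be listed on a percentage scale.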
