
WMoE-CLIP: Wavelet-Enhanced Mixture-of-Experts Prompt Learning for Zero-Shot Anomaly Detection

About

Vision-language models have recently shown strong generalization in zero-shot anomaly detection (ZSAD), enabling the detection of unseen anomalies without task-specific supervision. However, existing approaches typically rely on fixed textual prompts, which struggle to capture complex semantics, and focus solely on spatial-domain features, limiting their ability to detect subtle anomalies. To address these challenges, we propose a wavelet-enhanced mixture-of-experts prompt learning method for ZSAD. Specifically, a variational autoencoder is employed to model global semantic representations and integrate them into prompts to enhance adaptability to diverse anomaly patterns. Wavelet decomposition extracts multi-frequency image features that dynamically refine textual embeddings through cross-modal interactions. Furthermore, a semantic-aware mixture-of-experts module is introduced to aggregate contextual information. Extensive experiments on 14 industrial and medical datasets demonstrate the effectiveness of the proposed method.
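To make the "multi-frequency image features" idea concrete, here is a minimal sketch of a single-level 2D Haar wavelet decomposition in plain Python. The abstract does not say which wavelet family, how many levels, or which inputs (raw pixels or encoder features) the method uses, so the Haar choice, the function name `haar_dwt2`, and the toy image are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Single-level orthonormal 2D Haar transform (illustrative assumption).

    Returns (approx, row_detail, col_detail, diag_detail), each half the
    height and width of the input. Height and width must be even.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block: top-left, top-right, bottom-left, bottom-right.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2.0)  # low-frequency average
            r_row.append((tl + tr - bl - br) / 2.0)  # top minus bottom row
            c_row.append((tl - tr + bl - br) / 2.0)  # left minus right column
            d_row.append((tl - tr - bl + br) / 2.0)  # diagonal difference
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail


# Toy example: a flat 4x4 image with one bright "defect" pixel.
image = [[2.0] * 4 for _ in range(4)]
image[1][2] = 6.0

approx, row_detail, col_detail, diag_detail = haar_dwt2(image)
print(approx)      # [[4.0, 6.0], [4.0, 4.0]]
print(row_detail)  # [[0.0, -2.0], [0.0, 0.0]]
```

In this toy case the detail sub-bands are zero everywhere except in the block containing the defect, which is the intuition behind using frequency-domain features to expose subtle anomalies that are easy to miss in the spatial domain.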

Peng Chen, Chao Huang • 2026

Related benchmarks

Task                            Dataset         Result              Rank
Anomaly Localization            MVTec AD        Pixel AUROC 92.1    513
Anomaly Detection               VisA            AUROC 87.3          261
Anomaly Detection               Br35H           AUROC 98.1          45
Anomaly Detection               BTAD            AUROC 92.6          41
Pixel-level Anomaly Detection   ColonDB         AUROC 84.3          39
Image-level Anomaly Detection   HeadCT          AUROC 98.2          37
Image-level Anomaly Detection   DTD Synthetic   AUROC 95            31
Anomaly Localization            BTAD            AUROC 93.3          29
Anomaly Detection               MVTec AD        --                  29
Anomaly Localization            DAGM            AUROC 99.5          26
Showing 10 of 20 rows
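
The results above are reported as AUROC, which appears to be on a 0-100 scale. As a reminder of what the metric measures, the following sketch computes AUROC as the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one, with ties counted as half. The function name `auroc` and the sample labels and scores are made up for illustration and are not taken from the paper or from any benchmark.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) AUROC; label 1 = anomalous, 0 = normal."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomaly ranked above normal
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical image-level anomaly scores for four test images.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
result = auroc(labels, scores)
print(round(100 * result, 1))  # 75.0
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration. For pixel-level AUROC, where every pixel is a sample, a sort-based implementation would be the practical choice.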
