CoPS: Conditional Prompt Synthesis for Zero-Shot Anomaly Detection
About
Large pre-trained vision-language models have recently shown remarkable performance in zero-shot anomaly detection (ZSAD). After fine-tuning on a single auxiliary dataset, such a model can perform cross-category anomaly detection on diverse datasets covering industrial defects and medical lesions. Compared to manually designed prompts, prompt learning eliminates the need for expert knowledge and trial-and-error. However, prompt learning still faces two challenges: (i) static learnable tokens struggle to capture the continuous and diverse patterns of normal and anomalous states, limiting generalization to unseen categories; (ii) fixed textual labels provide overly sparse category information, making the model prone to overfitting to a specific semantic subspace. To address these issues, we propose Conditional Prompt Synthesis (CoPS), a novel framework that synthesizes dynamic prompts conditioned on visual features to enhance ZSAD performance. Specifically, we extract representative normal and anomaly prototypes from fine-grained patch features and explicitly inject them into prompts, enabling adaptive state modeling. To compensate for the sparsity of class labels, we leverage a variational autoencoder to model semantic image features and implicitly fuse varied class tokens into prompts. Combined with our spatially-aware alignment mechanism, CoPS surpasses state-of-the-art methods by 1.4% in classification AUROC and 1.9% in segmentation AUROC in extensive experiments across 13 industrial and medical datasets. The code is available at https://github.com/cqylunlun/CoPS.
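To make the idea of injecting prototypes into prompts more concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function names `extract_prototypes` and `synthesize_prompt`, the top-k averaging rule, the per-patch `anomaly_scores` input, and all tensor shapes are hypothetical choices made purely for illustration. The variational autoencoder branch and the alignment mechanism are not covered.

```python
import numpy as np

def extract_prototypes(patch_feats, anomaly_scores, k=4):
    """Assumed rule: average the k lowest-scoring patches as the normal
    prototype and the k highest-scoring patches as the anomaly prototype."""
    order = np.argsort(anomaly_scores)  # ascending: most normal first
    normal_proto = patch_feats[order[:k]].mean(axis=0)
    anomaly_proto = patch_feats[order[-k:]].mean(axis=0)
    return normal_proto, anomaly_proto

def synthesize_prompt(static_tokens, prototype):
    """Assumed injection: append the prototype as one extra prompt token,
    so the prompt now depends on the input image."""
    return np.concatenate([static_tokens, prototype[None, :]], axis=0)

# Hypothetical shapes: 196 patches, 512-dim features, 12 learnable tokens.
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(196, 512))
anomaly_scores = rng.uniform(size=196)
static_tokens = rng.normal(size=(12, 512))

normal_proto, anomaly_proto = extract_prototypes(patch_feats, anomaly_scores)
normal_prompt = synthesize_prompt(static_tokens, normal_proto)
anomaly_prompt = synthesize_prompt(static_tokens, anomaly_proto)
print(normal_prompt.shape, anomaly_prompt.shape)
```

The point of the sketch is only that the resulting prompts differ per image, in contrast to static learnable tokens that stay the same for every input.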
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Anomaly Detection | VisA | -- | 261 |
| Anomaly Segmentation | MVTec AD | -- | 105 |
| Anomaly Detection | MPDD | -- | 62 |
| Anomaly Segmentation | BTAD | Average Pixel AUROC: 94.6 | 48 |
| Anomaly Detection | Br35H | AUROC: 98.7 | 45 |
| Anomaly Segmentation | MPDD | AUROC: 0.975 | 44 |
| Anomaly Detection | BTAD | AUROC: 93.6 | 41 |
| Pixel-level Anomaly Detection | ColonDB | -- | 39 |
| Anomaly Segmentation | Kvasir | AP: 51.5 | 20 |
| Anomaly Segmentation | VisA | AUC-P: 95.7 | 15 |
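
Most results above are reported as AUROC, at image level for detection and at pixel level for segmentation. As a reminder of what the number measures, the following sketch computes AUROC as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one, with ties counted as one half. The function name `auroc` and the toy inputs are illustrative assumptions, not the evaluation code used by the authors or by the benchmarks.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC via pairwise comparison of anomalous (1) vs. normal (0) scores.
    Quadratic in memory, so intended for small illustrative inputs only."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one normal and one anomalous sample")
    # Count anomalous-vs-normal pairs ranked correctly; ties count as 0.5.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (pos.size * neg.size))

# Image-level: one label and one anomaly score per image.
image_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Pixel-level: flatten the ground-truth mask and the predicted score map.
mask = np.array([[0, 1], [0, 0]])
score_map = np.array([[0.2, 0.9], [0.1, 0.3]])
pixel_auroc = auroc(mask.ravel(), score_map.ravel())
print(image_auroc, pixel_auroc)
```

Note that the table mixes scales (for example 0.975 next to 94.6), so values should be normalized to a common range before comparing them.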