AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP
About
Anomaly detection (AD) identifies outliers for applications like defect and lesion detection. While CLIP shows promise for zero-shot AD tasks due to its strong generalization capabilities, its inherent Anomaly-Unawareness leads to limited discrimination between normal and abnormal features. To address this problem, we propose Anomaly-Aware CLIP (AA-CLIP), which enhances CLIP's anomaly discrimination ability in both text and visual spaces while preserving its generalization capability. AA-CLIP is achieved through a straightforward yet effective two-stage approach: it first creates anomaly-aware text anchors to differentiate normal and abnormal semantics clearly, then aligns patch-level visual features with these anchors for precise anomaly localization. This two-stage strategy, with the help of residual adapters, gradually adapts CLIP in a controlled manner, achieving effective AD while maintaining CLIP's class knowledge. Extensive experiments validate AA-CLIP as a resource-efficient solution for zero-shot AD tasks, achieving state-of-the-art results in industrial and medical applications. The code is available at https://github.com/Mwxinnn/AA-CLIP.
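The second stage described above (aligning patch-level visual features with normal and abnormal text anchors) can be illustrated with a minimal sketch. This is not the authors' implementation, which is in the linked repository. The function names, the blending weight in `residual_blend`, the softmax temperature, and the max-over-patches image score are all assumptions chosen for illustration, and the 2-D vectors are toy stand-ins for CLIP embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def residual_blend(original, adapted, weight=0.1):
    """Assumed residual-adapter form: keep most of the frozen feature and
    mix in a small share of the adapter output. The weight is hypothetical."""
    return [(1 - weight) * o + weight * a for o, a in zip(original, adapted)]

def patch_anomaly_scores(patch_features, normal_anchor, abnormal_anchor,
                         temperature=0.07):
    """Per-patch anomaly probability: a two-way softmax over the cosine
    similarities to the normal and abnormal text anchors (assumed scoring)."""
    scores = []
    for feat in patch_features:
        s_normal = cosine(feat, normal_anchor) / temperature
        s_abnormal = cosine(feat, abnormal_anchor) / temperature
        m = max(s_normal, s_abnormal)  # subtract the max for numerical stability
        e_normal = math.exp(s_normal - m)
        e_abnormal = math.exp(s_abnormal - m)
        scores.append(e_abnormal / (e_normal + e_abnormal))
    return scores

# Toy example: three patches, with anchors along orthogonal axes.
normal_anchor = [1.0, 0.0]
abnormal_anchor = [0.0, 1.0]
patches = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

scores = patch_anomaly_scores(patches, normal_anchor, abnormal_anchor)
image_score = max(scores)  # assumed image-level aggregation: worst patch
blended = residual_blend([1.0, 0.0], [0.0, 1.0], weight=0.1)
```

In this toy setup the first patch scores below 0.5, the second above 0.5, and the third, which is equidistant from both anchors, scores 0.5. Reshaping the per-patch scores to the patch grid would give a coarse anomaly map for localization.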
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Anomaly Localization | MVTec AD | Pixel AUROC | 91.9 | 534 |
| Anomaly Detection | VisA | AUROC | 84.6 | 293 |
| Anomaly Detection | VisA (test) | -- | -- | 148 |
| Anomaly Detection | MVTec | AUROC | 90.5 | 105 |
| Anomaly Detection | MPDD (test) | Image-level AU-ROC | 77 | 104 |
| Anomaly Detection | MVTec AD | -- | -- | 92 |
| Anomaly Detection | BraTS | Image-level AUROC | 57.647 | 90 |
| Image-level Anomaly Detection | MVTec AD | AUROC | 90.9 | 82 |
| Image-level Anomaly Detection | VisA | AUC | 79.2 | 80 |
| Anomaly Localization | MVTec | AUC | 91.9 | 78 |
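
For readers unfamiliar with the metric in the table, the sketch below shows one standard way to compute AUROC: the fraction of (anomalous, normal) pairs in which the anomalous sample receives the higher score, with ties counted as half. The function name and the toy labels and scores are illustrative assumptions, not data from the paper. The table values appear to be this quantity expressed as a percentage.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) AUROC: probability that a random anomalous
    sample outscores a random normal one, with ties counted as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data: 1 marks an anomalous sample, 0 a normal one.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
value = auroc(labels, scores)  # 3 of 4 pairs are ranked correctly -> 0.75
```

The same function applies at pixel level by flattening the anomaly map and the ground-truth mask into lists, although the quadratic pairwise loop is only practical for small inputs.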