AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP
About
Anomaly detection (AD) identifies outliers for applications like defect and lesion detection. While CLIP shows promise for zero-shot AD tasks due to its strong generalization capabilities, its inherent Anomaly-Unawareness leads to limited discrimination between normal and abnormal features. To address this problem, we propose Anomaly-Aware CLIP (AA-CLIP), which enhances CLIP's anomaly discrimination ability in both text and visual spaces while preserving its generalization capability. AA-CLIP is achieved through a straightforward yet effective two-stage approach: it first creates anomaly-aware text anchors to differentiate normal and abnormal semantics clearly, then aligns patch-level visual features with these anchors for precise anomaly localization. This two-stage strategy, with the help of residual adapters, gradually adapts CLIP in a controlled manner, achieving effective AD while maintaining CLIP's class knowledge. Extensive experiments validate AA-CLIP as a resource-efficient solution for zero-shot AD tasks, achieving state-of-the-art results in industrial and medical applications. The code is available at https://github.com/Mwxinnn/AA-CLIP.
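As a rough illustration of the second stage described above, the sketch below scores each patch feature against a normal and an abnormal text anchor and turns the two cosine similarities into an anomaly probability. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `temperature` value, and the blending `ratio` in `residual_adapt` are hypothetical, and plain Python lists stand in for CLIP embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def residual_adapt(feature, adapted, ratio=0.1):
    """Blend a frozen feature with an adapter output.

    Assumption: a simple convex combination; `ratio` is a hypothetical
    hyperparameter controlling how far the feature moves from the original.
    """
    return [(1 - ratio) * f + ratio * a for f, a in zip(feature, adapted)]


def patch_anomaly_scores(patches, normal_anchor, abnormal_anchor, temperature=0.07):
    """Return one anomaly probability per patch.

    Each patch is compared with both text anchors; a two-way softmax over
    the temperature-scaled similarities gives the abnormal probability.
    """
    scores = []
    for patch in patches:
        s_normal = cosine(patch, normal_anchor) / temperature
        s_abnormal = cosine(patch, abnormal_anchor) / temperature
        # Subtract the max before exponentiating for numerical stability.
        m = max(s_normal, s_abnormal)
        e_normal = math.exp(s_normal - m)
        e_abnormal = math.exp(s_abnormal - m)
        scores.append(e_abnormal / (e_normal + e_abnormal))
    return scores


if __name__ == "__main__":
    # Toy 2-D features: the first patch matches the normal anchor,
    # the second matches the abnormal anchor.
    toy_patches = [[1.0, 0.0], [0.0, 1.0]]
    print(patch_anomaly_scores(toy_patches, [1.0, 0.0], [0.0, 1.0]))
```

Reshaping the per-patch scores back onto the patch grid would give a coarse anomaly map; how the actual method upsamples or aggregates that map is not specified here.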
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Anomaly Localization | MVTec AD | Pixel AUROC | 91.9 | 513 |
| Anomaly Detection | VisA | AUROC | 84.6 | 261 |
| Image-level Anomaly Detection | MVTec AD | AUROC | 90.9 | 82 |
| Image-level Anomaly Detection | VisA | AUC | 79.2 | 80 |
| Anomaly Detection | MVTec | AUROC | 90.5 | 79 |
| Anomaly Localization | MVTec | AUC | 91.9 | 78 |
| Anomaly Classification | LiverCT | AUC | 69.7 | 72 |
| Anomaly Detection | MPDD | Clean AUROC | 0.783 | 62 |
| 3D Anomaly Detection | Real3D-AD | Average O-AUROC | 0.748 | 56 |
| Anomaly Detection | DTD | AUROC | 93.3 | 55 |