AnomalyDINO: Boosting Patch-based Few-shot Anomaly Detection with DINOv2

About

Recent advances in multimodal foundation models have set new standards in few-shot anomaly detection. This paper explores whether high-quality visual features alone are sufficient to rival existing state-of-the-art vision-language models. We affirm this by adapting DINOv2 for one-shot and few-shot anomaly detection, with a focus on industrial applications. We show that this approach not only rivals existing techniques but can even outmatch them in many settings. Our proposed vision-only approach, AnomalyDINO, follows the well-established patch-level deep nearest neighbor paradigm, and enables both image-level anomaly prediction and pixel-level anomaly segmentation. The approach is methodologically simple and training-free and, thus, does not require any additional data for fine-tuning or meta-learning. Despite its simplicity, AnomalyDINO achieves state-of-the-art results in one- and few-shot anomaly detection (e.g., pushing the one-shot performance on MVTec-AD from an AUROC of 93.1% to 96.6%). The reduced overhead, coupled with its outstanding few-shot performance, makes AnomalyDINO a strong candidate for fast deployment, e.g., in industrial contexts.
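The patch-level deep nearest neighbor paradigm mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: random vectors stand in for DINOv2 patch features, and the helper names (`patch_anomaly_scores`, `image_score`), the cosine distance, and the max aggregation for the image-level score are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def patch_anomaly_scores(test_patches, memory_bank):
    """Cosine distance from each test patch to its nearest reference patch.

    test_patches: (n_test, dim) patch features of the image under test.
    memory_bank:  (n_ref, dim) patch features from the few nominal reference images.
    """
    t = l2_normalize(test_patches)
    m = l2_normalize(memory_bank)
    similarity = t @ m.T                    # shape (n_test, n_ref)
    return 1.0 - similarity.max(axis=1)     # nearest neighbor = highest similarity


def image_score(patch_scores):
    """Image-level score; taking the max over patches is an assumed aggregation."""
    return float(np.max(patch_scores))


# Toy data: random vectors stand in for real patch features (assumption).
rng = np.random.default_rng(0)
dim, grid = 16, (4, 4)
reference = rng.normal(size=(grid[0] * grid[1], dim))          # one nominal "image"
normal_test = reference + 0.01 * rng.normal(size=reference.shape)
anomalous_test = normal_test.copy()
anomalous_test[5] = rng.normal(size=dim)                        # corrupt a single patch

# Reshaping the per-patch distances to the patch grid gives a coarse anomaly map.
normal_map = patch_anomaly_scores(normal_test, reference).reshape(grid)
anomalous_map = patch_anomaly_scores(anomalous_test, reference).reshape(grid)

print("normal image score:   ", image_score(normal_map))
print("anomalous image score:", image_score(anomalous_map))
print("most anomalous patch: ", int(np.argmax(anomalous_map)))
```

In this sketch the memory bank is built from a single reference image, mirroring the one-shot setting; with more shots, the reference patch features would simply be concatenated along the first axis. No training step is involved, which is the property the abstract highlights.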

Simon Damm, Mike Laszkiewicz, Johannes Lederer, Asja Fischer • 2024

Related benchmarks

Task	Dataset	Result
Anomaly Localization	MVTec AD	Pixel AUROC 98.1	543
Anomaly Detection	MVTec-AD (test)	I-AUROC 94.2	348
Anomaly Detection	VisA	AUROC 92.6	293
Anomaly Detection	VisA (test)	--	178
Anomaly Detection	RESC	--	67
Anomaly Detection	MVTec AD	Image-level AUROC 97.7	52
Anomaly Detection	BraTS 2021	--	50
Anomaly Detection	MVTec AD	I-AUROC 96.8	43
Anomaly Localization	VisA	AUROC 98.2	38
Anomaly Detection and Classification	MVTec FS	I-AUC 97.5	24

Showing 10 of 47 rows
