VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

About

We present a transformer-based image anomaly detection and localization network. The proposed model combines a reconstruction-based approach with patch embedding. The transformer network helps preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. We also publish BTAD, a real-world industrial anomaly dataset. We compare our results with other state-of-the-art algorithms on publicly available datasets such as MNIST and MVTec.
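To make the patch-based pipeline more concrete, here is a minimal NumPy sketch of two of the ideas the abstract names: splitting an image into non-overlapping patches, and scoring each patch by its negative log-likelihood under a Gaussian mixture. It is an illustration under stated assumptions, not the authors' implementation. The function names `image_to_patches` and `gmm_negative_log_likelihood` are hypothetical, the transformer encoder is omitted entirely, and the mixture parameters are fixed by hand, whereas a mixture density network would predict them.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping, flattened patches.

    Returns an array of shape (H//p * W//p, p*p*C), patches in row-major order.
    """
    h, w, c = image.shape
    p = patch_size
    if h % p or w % p:
        raise ValueError("image height and width must be divisible by patch_size")
    grid = image.reshape(h // p, p, w // p, p, c)
    grid = grid.transpose(0, 2, 1, 3, 4)  # -> (H/p, W/p, p, p, C)
    return grid.reshape((h // p) * (w // p), p * p * c)


def gmm_negative_log_likelihood(features, weights, means, variances):
    """Per-row negative log-likelihood under a diagonal-covariance mixture.

    features: (N, D); weights: (K,); means: (K, D); variances: (K, D).
    Assumption: parameters are given. In a mixture density network they
    would be outputs of the network instead.
    """
    diff = features[:, None, :] - means[None, :, :]                  # (N, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_exp = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)  # (N, K)
    log_comp = np.log(weights)[None, :] + log_norm[None, :] + log_exp
    # Numerically stable log-sum-exp over the K components.
    peak = log_comp.max(axis=1, keepdims=True)
    log_lik = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return -log_lik
```

Reshaping the per-patch scores back to the `(H//p, W//p)` grid gives a coarse anomaly map in which higher values mark patches that are less likely under the mixture. Whether the original work scores patches exactly this way is not stated in the abstract.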

Pankaj Mishra, Riccardo Verk, Daniele Fornasier, Claudio Piciarelli, Gian Luca Foresti • 2021

Related benchmarks

Task | Dataset | Result | Rank
Anomaly Localization | MVTec-AD (test) | -- | 181
Anomaly Detection | BTAD | Average Image-level AUROC: 83.7 | 45
Anomaly Segmentation | BTAD | Average Pixel AUROC: 90 | 41
Anomaly Detection | BTAD (test) | Mean Pixel AUROC: 0.9 | 30
Anomaly Localization | BTAD | -- | 20
Anomaly Localization | BTAD (test) | Pixel AUROC (01): 99 | 13
Anomaly Detection | BTAD | PR-AUC: 99 | 12
Anomaly Classification | MNIST | Class 0 AUC: 0.99 | 9
Anomaly Segmentation | BTAD Category 1 (test) | AUROC: 76.3 | 5
Anomaly Segmentation | BTAD Category 2 (test) | AUROC: 88.9 | 5
Showing 10 of 13 rows
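
Most of the results above are AUROC values. As a reminder of what that metric measures, here is a small sketch that computes AUROC from binary labels and anomaly scores using the rank-sum (Mann-Whitney U) formulation with average ranks for ties. The function name `auroc` is hypothetical, and this is not the evaluation code behind any of the listed numbers. The function returns a value in [0, 1]. Some rows above appear to use a 0-100 scale instead, so values should be compared with that in mind.

```python
import numpy as np


def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum statistic.

    labels: array-like of 0/1 (1 = anomalous); scores: array-like of floats,
    where a higher score means "more anomalous". Ties get average ranks.
    """
    labels = np.asarray(labels).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Average 1-based rank of each score, shared across tied values.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    first_rank = last_rank - counts + 1
    ranks = ((first_rank + last_rank) / 2.0)[inverse]

    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)
```

For an image-level score, `labels` and `scores` hold one entry per image. For a pixel-level score, one common approach is to flatten the ground-truth masks and the predicted anomaly maps into one long vector each before calling the function.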
