
VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

About

We present a transformer-based image anomaly detection and localization network. Our proposed model combines a reconstruction-based approach with patch embedding. The transformer network helps to preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. We also publish BTAD, a real-world industrial anomaly dataset. We compare our results with other state-of-the-art algorithms on publicly available datasets such as MNIST and MVTec.

Pankaj Mishra, Riccardo Verk, Daniele Fornasier, Claudio Piciarelli, Gian Luca Foresti • 2021
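
The abstract describes two ideas that a small example can make concrete: cutting the image into patches whose grid position is kept, and scoring each patch with a Gaussian mixture so that unlikely patches mark anomalous areas. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it scores raw pixel patches instead of transformer-encoded features, it uses fixed mixture parameters instead of ones predicted by a network, and the function names (`split_into_patches`, `gmm_nll`, `anomaly_map`) are hypothetical.

```python
import math


def split_into_patches(image, patch):
    """Split a 2-D list (H x W) into non-overlapping patch x patch tiles.

    Tiles are returned in row-major order, each flattened to a 1-D list,
    so the index of a tile still encodes its position in the grid.
    """
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by patch size")
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def gmm_nll(x, weights, means, variances):
    """Negative log-likelihood of vector x under a diagonal-covariance
    Gaussian mixture (assumed, hand-set parameters)."""
    log_terms = []
    for pi_k, mu, var in zip(weights, means, variances):
        log_p = math.log(pi_k)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # log-sum-exp over components for numerical stability
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))


def anomaly_map(image, patch, weights, means, variances):
    """Return a grid of per-patch scores; higher means less likely."""
    grid_w = len(image[0]) // patch
    scores = [gmm_nll(p, weights, means, variances)
              for p in split_into_patches(image, patch)]
    return [scores[i:i + grid_w] for i in range(0, len(scores), grid_w)]
```

Called on a 4 x 4 image with `patch=2`, `anomaly_map` returns a 2 x 2 grid of scores, and the patch that deviates most from the mixture receives the highest value. Thresholding that grid would give a coarse localization mask.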

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Anomaly Localization | MVTec-AD (test) | -- | -- | 211 |
| Anomaly Segmentation | BTAD | Average Pixel AUROC | 90 | 48 |
| Anomaly Detection | BTAD | Average Image-level AUROC | 83.7 | 45 |
| Anomaly Detection | BTAD (test) | Mean Pixel AUROC | 0.9 | 30 |
| Anomaly Localization | BTAD | -- | -- | 29 |
| Anomaly Localization | BTAD (test) | Avg Pixel AUROC | 90 | 24 |
| Anomaly Detection | BTAD | PR-AUC | 99 | 12 |
| Anomaly Classification | MNIST | Class 0 AUC | 0.99 | 9 |
| Anomaly Segmentation | BTAD Category 1 (test) | AUROC | 76.3 | 5 |
| Anomaly Segmentation | BTAD Category 2 (test) | AUROC | 88.9 | 5 |
Showing 10 of 13 rows
