Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection

About

Because of how they are trained, traditional industrial anomaly detection (IAD) methods need a separate model for each deployment scenario, which is insufficient to meet the requirements of modern design and manufacturing. In contrast, large multimodal models (LMMs) have shown strong generalization ability on various vision tasks, and their perception and comprehension capabilities suggest that LMMs could be applied to IAD tasks. However, we observe that even though LMMs have abundant knowledge about industrial anomaly detection in the textual domain, they are unable to leverage that knowledge because of the modality gap between the textual and visual domains. To stimulate the relevant knowledge in LMMs and adapt them to anomaly detection tasks, we introduce existing IAD methods as vision experts and present a novel large multimodal model applying vision experts for industrial anomaly detection (abbreviated to Myriad). Specifically, we use the anomaly map generated by the vision experts as guidance for the LMM, so that the vision model pays more attention to anomalous regions. The visual features are then modulated by an adapter to fit anomaly detection tasks and fed into the language model, together with the vision expert guidance and human instructions, to generate the final outputs. Extensive experiments on the MVTec-AD, VisA, and PCB Bank benchmarks demonstrate that our proposed method not only performs favorably against state-of-the-art methods, but also inherits the flexibility and instruction-following ability of LMMs in the field of IAD. Source code and pre-trained models are publicly available at https://github.com/tzjtatata/Myriad.
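
The guidance idea in the abstract (an expert's anomaly map steering attention toward anomalous regions) can be pictured with a toy sketch. This is not the Myriad implementation: the function names `pool_anomaly_map` and `guide_patch_features`, the average pooling, the multiplicative reweighting, and the `alpha` parameter are all assumptions made for illustration only.

```python
from typing import List


def pool_anomaly_map(anomaly_map: List[List[float]], patch: int) -> List[float]:
    """Average-pool an H x W anomaly map into one score per patch (row-major order)."""
    h, w = len(anomaly_map), len(anomaly_map[0])
    if h % patch or w % patch:
        raise ValueError("map height and width must be divisible by the patch size")
    scores = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            block = [
                anomaly_map[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ]
            scores.append(sum(block) / len(block))
    return scores


def guide_patch_features(
    features: List[List[float]], scores: List[float], alpha: float = 1.0
) -> List[List[float]]:
    """Hypothetical guidance: scale each patch feature by (1 + alpha * anomaly score)."""
    if len(features) != len(scores):
        raise ValueError("need exactly one anomaly score per patch feature")
    return [[v * (1.0 + alpha * s) for v in feat] for feat, s in zip(features, scores)]


# A 4 x 4 anomaly map whose bottom-right 2 x 2 block is flagged by the expert.
toy_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
patch_scores = pool_anomaly_map(toy_map, patch=2)
patch_features = [[1.0, 2.0] for _ in range(4)]  # four patches, two channels each
guided = guide_patch_features(patch_features, patch_scores, alpha=1.0)
```

In this toy setting only the patch covering the flagged region is amplified; the other three patches pass through unchanged. A learned adapter, as the abstract describes, would replace this fixed scaling.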

Yuanze Li, Haolin Wang, Shihao Yuan, Ming Liu, Debin Zhao, Yiwen Guo, Chen Xu, Guangming Shi, Wangmeng Zuo • 2023

Related benchmarks

Task                  Dataset    Result                            Rank
Anomaly Detection     VisA       --                                261
Anomaly Segmentation  MVTec AD   AUROC (Pixelwise) 0.923           105
Anomaly Segmentation  VisA       AUPRO 81.3                        30
Anomaly Detection     MVTec AD   Img AUROC 93.2                    16
Anomaly Detection     MVTec AD   Accuracy 87.4                     10
Anomaly Detection     RIAD       metal_welding Image Score 95.2    8
Anomaly Segmentation  WFDD       Pixel AUROC 93.9                  8
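
Several of the rows above report AUROC, at image level or pixel level. As a reminder of what that number measures, here is a minimal sketch that computes AUROC as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions, not data from the paper. The result lies in [0, 1]; some rows in the listing appear to use that scale (0.923) and others a percentage (93.2).

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of anomalous (1) and normal (0) scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous sample correctly ranked above normal one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: two normal images (label 0) and two anomalous images (label 1).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
result = auroc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly -> 0.75
```

For pixel-level AUROC the same computation applies with every pixel treated as a sample. The pairwise loop is quadratic, so at that scale a sort-based routine from a numerical library is the practical choice.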