M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection
About
Although multimodal large language models (MLLMs) have advanced industrial anomaly detection toward a zero-shot paradigm, they still tend to produce high-confidence yet unreliable decisions in fine-grained and structurally complex industrial scenarios, and lack effective self-corrective mechanisms. To address this issue, we propose M3-AD, a unified reflection-aware multimodal framework for industrial anomaly detection. M3-AD comprises two complementary data resources: M3-AD-FT, designed for reflection-aligned fine-tuning, and M3-AD-Bench, designed for systematic cross-category evaluation, together providing a foundation for reflection-aware learning and reliability assessment. Building upon this foundation, we propose RA-Monitor, which models reflection as a learnable decision revision process and guides models to perform controlled self-correction when initial judgments are unreliable, thereby improving decision robustness. Extensive experiments conducted on M3-AD-Bench demonstrate that RA-Monitor outperforms multiple open-source and commercial MLLMs in zero-shot anomaly detection and anomaly analysis tasks. Code will be released at https://github.com/Yanhui-Lee/M3-AD.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Industrial Anomaly Detection | M3-AD Texture | Accuracy | 91.2 | 21 |
| Industrial Anomaly Detection | M3-AD Workpiece | Accuracy | 74.3 | 21 |
| Industrial Anomaly Detection | M3-AD Electronic | Accuracy | 79.1 | 21 |
| Industrial Anomaly Detection | M3-AD Average | Accuracy | 80.6 | 21 |
| Anomaly Localization | M3-AD Texture Scene | Localization Score | 78.8 | 19 |
| Anomaly Localization | M3-AD Workpiece Scene | Localization Score | 60.7 | 19 |
| Anomaly Localization | M3-AD Electronic Scene | Localization Score | 59.1 | 19 |
| Anomaly Localization | M3-AD Average across scenes | Localization Score | 65.3 | 19 |
| Anomaly Type Classification | M3-AD Workpiece Scene | Type Proportion | 52.2 | 19 |
| Anomaly Type Classification | M3-AD Electronic Scene | Type Metric | 58.7 | 19 |