HarmoniAD: Harmonizing Local Structures and Global Semantics for Anomaly Detection

About

Anomaly detection is crucial in industrial product quality inspection. Failing to detect tiny defects often leads to serious consequences. Existing methods face a structure-semantics trade-off: structure-oriented models (such as frequency-based filters) are noise-sensitive, while semantics-oriented models (such as CLIP-based encoders) often miss fine details. To address this, we propose HarmoniAD, a frequency-guided dual-branch framework. Features are first extracted by the CLIP image encoder, then transformed into the frequency domain, and finally decoupled into high- and low-frequency paths for complementary modeling of structure and semantics. The high-frequency branch is equipped with a fine-grained structural attention module (FSAM) to enhance textures and edges for detecting small anomalies, while the low-frequency branch uses a global structural context module (GSCM) to capture long-range dependencies and preserve semantic consistency. Together, these branches balance fine detail and global semantics. HarmoniAD further adopts a multi-class joint training strategy, and experiments on MVTec-AD, VisA, and BTAD show state-of-the-art performance with both sensitivity and robustness.
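The frequency decoupling step described above can be illustrated with a small sketch. The abstract does not say which transform or cutoff HarmoniAD uses, so this example assumes a 2D FFT over the spatial axes and a radial low-pass mask. The function name `split_frequency_bands` and the `cutoff_ratio` parameter are hypothetical, and a random array stands in for CLIP image-encoder features. It is not the authors' implementation, and FSAM and GSCM are not modeled.

```python
import numpy as np


def split_frequency_bands(features, cutoff_ratio=0.25):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    Assumption: cutoff_ratio is an illustrative hyperparameter giving the
    radius of the low-pass disc as a fraction of the largest frequency radius.
    """
    _, height, width = features.shape
    spectrum = np.fft.fft2(features, axes=(-2, -1))

    # Frequency coordinates (cycles per sample) in the same layout as fft2.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    # Frequencies inside the disc go to the low band, the rest to the high band.
    low_mask = radius <= cutoff_ratio * radius.max()
    low = np.fft.ifft2(spectrum * low_mask, axes=(-2, -1)).real
    high = np.fft.ifft2(spectrum * ~low_mask, axes=(-2, -1)).real
    return low, high


# Stand-in for encoder features: 8 channels on a 16 x 16 spatial grid.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
low, high = split_frequency_bands(features)

# The two bands partition the spectrum, so they sum back to the input.
assert np.allclose(low + high, features)
```

Because the two masks are complementary, no information is lost in the split. In a dual-branch design of the kind the abstract describes, the high band would feed a structure-oriented module and the low band a semantics-oriented one before the two are fused.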

Naiqi Zhang, Chuancheng Shi, Jingtong Dou, Wenhua Wu, Fei Shen, Jianhua Cao • 2026

Related benchmarks

Task                       Dataset                    Result                   Rank
Anomaly Detection          VisA                       --                       52
Anomaly Detection          MVTec-AD Unified setting   --                       9
Unified Anomaly Detection  BTAD                       ROC AUC (Pixel): 98.9    8
