
WD-FQDet: Multispectral Detection Transformer via Wavelet Decomposition and Frequency-aware Query Learning

About

Infrared-visible object detection improves detection performance by combining complementary features from multispectral images. Existing backbone-specific and backbone-shared approaches still suffer from severely biased modality-shared features and insufficient modality-specific features. To address these issues, we propose WD-FQDet, a novel detection framework that explicitly decouples modality-shared and modality-specific information in the infrared and visible modalities from the perspective of low- and high-frequency domains, allowing fusion strategies tailored to their frequency characteristics. Specifically, a low-frequency homogeneity alignment module aligns modality-shared features across modalities via a cross-modal attention mechanism, and a high-frequency specificity retention module preserves modality-specific features through a multi-scale gradient consistency loss. To reinforce the feature representation in the frequency domain, we propose a hybrid feature enhancement module that incorporates spatial cues. Furthermore, because the contributions of homogeneous and modality-specific features to object detection vary across scenarios, we propose a frequency-aware query selection module that dynamically regulates their contributions. Experimental results on the FLIR, LLVIP, and M3FD datasets demonstrate that WD-FQDet achieves state-of-the-art performance across multiple evaluation metrics.
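
The split into low- and high-frequency components described above can be illustrated with a single-level 2D wavelet transform. The sketch below is only an illustration, not the authors' implementation: the abstract does not state which wavelet, how many decomposition levels, or which tensors WD-FQDet decomposes, so the Haar wavelet, the single level, the plain 2D NumPy input, and the function names `haar_dwt2` and `haar_idwt2` are all assumptions made here.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform (assumed wavelet, not taken from the paper).

    Returns the low-frequency approximation band and a tuple of three
    high-frequency detail bands, each half the input size per axis.
    """
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # scaled block average (smooth content)
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return low, (col_diff, row_diff, diag)


def haar_idwt2(low, highs):
    """Invert haar_dwt2 exactly, rebuilding the original 2D array."""
    col_diff, row_diff, diag = highs
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (low + col_diff + row_diff + diag) / 2.0
    x[0::2, 1::2] = (low - col_diff + row_diff - diag) / 2.0
    x[1::2, 0::2] = (low + col_diff - row_diff - diag) / 2.0
    x[1::2, 1::2] = (low - col_diff - row_diff + diag) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))  # stand-in for one infrared or visible channel
    low, highs = haar_dwt2(image)
    restored = haar_idwt2(low, highs)
    print(low.shape, np.allclose(restored, image))
```

Because this transform is orthonormal and invertible, the low band and the three detail bands together carry exactly the information of the input, which is what makes it reasonable to process the two frequency groups with different strategies and recombine them afterwards.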

Chunjin Yang, Xiwei Zhang, Yiming Xiao, Fanman Meng • 2026

Related benchmarks

Task                            Dataset              Metric   Result  Rank
Object Detection                FLIR                 mAP      87      65
Object Detection                LLVIP (test)         mAP50    98.2    64
Object Detection                FLIR Aligned (test)  mAP@0.5  87      40
Multispectral Object Detection  M3FD (val)           mAP50    73.7    9
