DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion

About

Infrared-visible object detection aims to achieve robust, even full-day, object detection by fusing the complementary information in infrared and visible images. However, two factors make this fusion difficult: the complementary characteristics of the two modalities vary highly dynamically, and misalignment between modalities is common. In this paper, we propose a Dynamic Adaptive Multispectral Detection Transformer (DAMSDet) to address both challenges simultaneously. Specifically, we propose a Modality Competitive Query Selection strategy that provides useful prior information by dynamically selecting a basic salient modality feature representation for each object. To effectively mine the complementary information and adapt to misalignment, we propose a Multispectral Deformable Cross-attention module that adaptively samples and aggregates multi-semantic-level features of infrared and visible images for each object. In addition, we adopt the cascade structure of DETR to better mine complementary information. Experiments on four public datasets covering different scenes demonstrate significant improvements over other state-of-the-art methods. The code will be released at https://github.com/gjj45/DAMSDet.
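
The abstract does not spell out how Modality Competitive Query Selection chooses a modality for each object. As a rough illustration only, the sketch below assumes that each encoder token in each modality carries a scalar saliency score, and that the infrared and visible tokens compete in one shared top-k ranking. The function name `competitive_query_selection`, the score lists, and the top-k rule are all hypothetical and are not taken from the DAMSDet code.

```python
from typing import List, Tuple


def competitive_query_selection(
    ir_scores: List[float],
    vis_scores: List[float],
    k: int,
) -> List[Tuple[str, int, float]]:
    """Pick the k highest-scoring tokens from the pooled IR and visible candidates.

    Assumption: one scalar saliency score per token per modality. Each result
    is (modality, token_index, score), so every selected query records which
    modality won the competition at that position.
    """
    # Pool the candidates from both modalities into a single list.
    candidates = [("ir", i, s) for i, s in enumerate(ir_scores)]
    candidates += [("vis", i, s) for i, s in enumerate(vis_scores)]
    # Rank by score, highest first; break ties by modality name, then index,
    # so the result is deterministic.
    candidates.sort(key=lambda c: (-c[2], c[0], c[1]))
    return candidates[:k]


# Toy scores for three token positions in each modality.
selected = competitive_query_selection([0.9, 0.1, 0.4], [0.2, 0.8, 0.5], k=3)
for modality, index, score in selected:
    print(modality, index, score)
```

In this toy run the first query comes from the infrared tokens and the next two from the visible tokens, which is the kind of per-object modality choice the abstract describes. A real detector would select feature vectors rather than scalar scores.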

Junjie Guo, Chenqiang Gao, Fangcen Liu, Deyu Meng, Xinbo Gao • 2024

Related benchmarks

Task              Dataset               Metric         Result  Rank
Object Detection  FLIR (test)           mAP50          0.866   83
Object Detection  M3FD dataset          mAP@0.5        80.2    48
Object Detection  FLIR                  mAP            57.08   40
Object Detection  M3FD                  mAP50 Person   18.15   16
Object Detection  M3FD                  AP (Person)    23.25   16
Object Detection  M3FD                  mAP            52.9    9
Object Detection  FLIR 5-shot (test)    mAP50          40.63   8
Object Detection  FLIR 10-shot (test)   mAP50          47.4    8
