
Equivariant Multi-Modality Image Fusion

About

Multi-modality image fusion is a technique that combines information from different sensors or modalities, enabling the fused image to retain complementary features from each modality, such as functional highlights and texture details. However, effective training of such fusion models is challenging due to the scarcity of ground-truth fusion data. To tackle this issue, we propose the Equivariant Multi-Modality imAge fusion (EMMA) paradigm for end-to-end self-supervised learning. Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations. Consequently, we introduce a novel training paradigm that encompasses a fusion module, a pseudo-sensing module, and an equivariant fusion module. These components enable the network training to follow the principles of the natural sensing-imaging process while satisfying the equivariant imaging prior. Extensive experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images, concurrently facilitating downstream multi-modal segmentation and detection tasks. The code is available at https://github.com/Zhaozixiang1228/MMIF-EMMA.
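The equivariant prior behind EMMA can be illustrated with a minimal sketch: fusing transformed inputs should match transforming the fused output, i.e. F(T(a), T(b)) ≈ T(F(a, b)). The sketch below is illustrative only; a pixel-wise average stands in for the learned fusion network, and a 90-degree rotation stands in for the transformation T (the actual modules and transformation set are defined in the paper and repository).

```python
import numpy as np

def fuse(a, b):
    # Placeholder fusion operator: a pixel-wise average stands in for
    # the learned fusion network F in EMMA.
    return 0.5 * (a + b)

def transform(x):
    # A sample transformation T under which natural imaging responses
    # are assumed equivariant (here: a 90-degree rotation).
    return np.rot90(x)

def equivariance_loss(a, b):
    # Self-supervised consistency term: penalize the mismatch between
    # F(T(a), T(b)) and T(F(a, b)). No ground-truth fused image is needed.
    lhs = fuse(transform(a), transform(b))
    rhs = transform(fuse(a, b))
    return float(np.mean((lhs - rhs) ** 2))

rng = np.random.default_rng(0)
ir = rng.random((64, 64))   # stand-in for the infrared modality
vis = rng.random((64, 64))  # stand-in for the visible modality
print(equivariance_loss(ir, vis))  # 0.0: the linear toy fusion is exactly equivariant
```

Because the toy average commutes with rotation, the loss is exactly zero here; for a real fusion network it is generally nonzero and serves as a training signal in place of missing ground truth.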

Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Kai Zhang, Shuang Xu, Dongdong Chen, Radu Timofte, Luc Van Gool • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic segmentation | MFNet (test) | mIoU | 61.98 | 134 |
| Semantic segmentation | FMB (test) | mIoU | 56.89 | 59 |
| Object Detection | LLVIP | mAP50 | 94 | 58 |
| Object Detection | M3FD dataset | mAP@0.5 | 82.9 | 48 |
| Semantic segmentation | MSRS | mIoU | 74.48 | 42 |
| Object Detection | M³FD (test) | mAP@0.5 (Full) | 83.09 | 34 |
| Infrared and Visible Image Fusion | TNO image fusion | MI (Mutual Information) | 2.98 | 30 |
| Infrared and Visible Image Fusion | RoadScene | MI | 3.18 | 28 |
| Semantic segmentation | FMB | mIoU | 0.5628 | 26 |
| Infrared-Visible Image Fusion | MSRS | Entropy (EN) | 6.73 | 23 |

Showing 10 of 21 rows.
