Task-driven Image Fusion with Learnable Fusion Loss

About

Multi-modal image fusion aggregates information from multiple sensor sources, achieving superior visual quality and perceptual features compared to single-source images, often improving downstream tasks. However, current fusion methods for downstream tasks still use predefined fusion objectives that potentially mismatch the downstream tasks, limiting adaptive guidance and reducing model flexibility. To address this, we propose Task-driven Image Fusion (TDFusion), a fusion framework incorporating a learnable fusion loss guided by task loss. Specifically, our fusion loss includes learnable parameters modeled by a neural network called the loss generation module. This module is supervised by the downstream task loss in a meta-learning manner. The learning objective is to minimize the task loss of fused images after optimizing the fusion module with the fusion loss. Iterative updates between the fusion module and the loss module ensure that the fusion network evolves toward minimizing task loss, guiding the fusion process toward the task objectives. TDFusion's training relies entirely on the downstream task loss, making it adaptable to any specific task, and it can be applied to any fusion or task network architecture. Fusion experiments on four different datasets, together with evaluations on semantic segmentation and object detection tasks, demonstrate TDFusion's performance.
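
The alternating update described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: a single scalar `theta` stands in for the fusion network, a single scalar `w` stands in for the loss generation module (a neural network in the paper), and the data, the quadratic losses, `TARGET_MIX`, and the step sizes are all made-up assumptions. It only shows the pattern: take a virtual fusion step under the learnable loss, update the loss parameter from the task loss measured after that step, then apply the real fusion update.

```python
# Toy illustration only: scalars replace the fusion and loss-generation networks.
PAIRS = [(1.0, 0.0), (0.2, 0.9), (0.7, 0.1), (0.0, 1.0)]  # made-up (source a, source b) intensities
TARGET_MIX = 0.8  # hypothetical: the downstream "task" is best served by 80% of source a


def fuse(theta, a, b):
    """Fusion module stand-in: a convex blend of the two sources."""
    return theta * a + (1.0 - theta) * b


def fusion_loss_grad(theta, w):
    """d/dtheta of mean[ w*(f-a)^2 + (1-w)*(f-b)^2 ], which equals mean[ 2*(a-b)^2*(theta-w) ]."""
    return sum(2.0 * (a - b) ** 2 * (theta - w) for a, b in PAIRS) / len(PAIRS)


def task_loss(theta):
    """Task loss stand-in: squared distance to the blend the task prefers."""
    return sum((fuse(theta, a, b) - fuse(TARGET_MIX, a, b)) ** 2 for a, b in PAIRS) / len(PAIRS)


def task_loss_grad(theta):
    """d/dtheta of task_loss."""
    return sum(
        2.0 * (fuse(theta, a, b) - fuse(TARGET_MIX, a, b)) * (a - b) for a, b in PAIRS
    ) / len(PAIRS)


def train(steps=200, alpha=0.3, beta=0.5):
    theta, w = 0.5, 0.5
    mean_d2 = sum((a - b) ** 2 for a, b in PAIRS) / len(PAIRS)
    for _ in range(steps):
        # 1) Virtual inner step: how the fusion parameter would move under the current loss.
        theta_virtual = theta - alpha * fusion_loss_grad(theta, w)
        # 2) Meta step: task loss after the virtual step, differentiated w.r.t. w
        #    through that step (d theta_virtual / d w = 2 * alpha * mean_d2).
        w -= beta * task_loss_grad(theta_virtual) * (2.0 * alpha * mean_d2)
        # 3) Real fusion update with the refreshed loss parameter.
        theta -= alpha * fusion_loss_grad(theta, w)
    return theta, w
```

Calling `train()` drives both `w` and `theta` toward `TARGET_MIX`, even though the fusion parameter is only ever updated through the fusion loss. In a realistic setting the hand-derived gradients would be replaced by automatic differentiation through the inner step.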

Haowen Bai, Jiangshe Zhang, Zixiang Zhao, Yichen Wu, Lilun Deng, Yukun Cui, Tao Feng, Shuang Xu • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Detection | LLVIP | mAP50 95 | 104 |
| Semantic segmentation | MSRS | mIoU 75.09 | 68 |
| Salient Object Detection | VT5000 | -- | 50 |
| Semantic segmentation | FMB | mIoU 0.605 | 49 |
| Infrared-Visible Image Fusion | MSRS | QAB/F (Quality Assessment Block/Fusion) 0.677 | 38 |
| Object Detection | M3FD | AP@[0.5:0.95] 62.87 | 35 |
| Infrared and Visible Image Fusion | MSRS 361 image pairs (test) | Entropy (EN) 6.734 | 14 |
| Infrared and Visible Image Fusion | IVOE (176 image pairs) | EN 7.338 | 14 |
| Infrared and Visible Image Fusion | FMB 280 image pairs | Entropy (EN) 6.987 | 14 |
| Video Fusion | VTMOT | QG 58.15 | 13 |
Showing 10 of 18 rows
