
One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion

About

Advanced image fusion methods mostly prioritise high-level missions, where task interaction struggles with semantic gaps, requiring complex bridging mechanisms. In contrast, we propose to leverage low-level vision tasks from digital photography fusion, allowing for effective feature interaction through pixel-level supervision. This new paradigm provides strong guidance for unsupervised multimodal fusion without relying on abstract semantics, enhancing task-shared feature learning for broader applicability. Owing to the hybrid image features and enhanced universal representations, the proposed GIFNet supports diverse fusion tasks, achieving high performance across both seen and unseen scenarios with a single model. Uniquely, experimental results reveal that our framework also supports single-modality enhancement, offering superior flexibility for practical applications. Our code will be available at https://github.com/AWCXV/GIFNet.

Chunyang Cheng, Tianyang Xu, Zhenhua Feng, Xiaojun Wu, Zhangyong Tang, Hui Li, Zeyang Zhang, Sara Atito, Muhammad Awais, Josef Kittler • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Semantic segmentation | MSRS | mIoU | 53.7 | 93 |
| Infrared-Visible Image Fusion | RoadScene (test) | Visual Information Fidelity (VIF) | 0.48 | 53 |
| Object Detection | M3FD | AP@[0.5:0.95] | 50.6 | 45 |
| Visible-Infrared Image Fusion | MSRS (test) | -- | -- | 43 |
| Infrared-Visible Image Fusion | MSRS | QAB/F (Quality Assessment Block/Fusion) | 0.4542 | 38 |
| Infrared-Visible Image Fusion | LLVIP (test) | EN | 7.01 | 36 |
| Multi-Focus Image Fusion | MFFW | QMI | 0.696 | 22 |
| Multi-Focus Image Fusion | Lytro | Qabf (Quality Index based on Fusion) | 0.5194 | 20 |
| Multi-Focus Image Fusion | RealMFF | Qabf (Quality Index) | 0.4834 | 20 |
| Infrared-Visible Video Fusion | VTMOT 2025 (test) | BiSWE | 10.239 | 13 |
Showing 10 of 51 rows
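
The EN result reported for LLVIP (test) is, in the image fusion literature, usually the Shannon entropy of the fused image's grey-level histogram, measured in bits. The sketch below shows how such a score could be computed. It assumes pixel intensities are supplied as a flat sequence of integers, and `image_entropy` is a hypothetical helper written for illustration, not code from the GIFNet repository or the benchmark's evaluation script.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of pixel intensities.

    Assumption: `pixels` is an iterable of hashable intensity values,
    e.g. integers in 0..255 for an 8-bit grayscale fused image.
    """
    counts = Counter(pixels)          # histogram: intensity -> occurrences
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pixels must contain at least one value")
    # H = -sum(p * log2(p)) over intensities that actually occur
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    flat = [0] * 8                    # constant image: no information
    two_tone = [0, 255] * 4           # two equally likely levels
    print(image_entropy(flat), image_entropy(two_tone))
```

For 8-bit images the score is bounded above by 8 bits, reached only when all 256 grey levels are equally frequent, so a value such as 7.01 indicates a fused image that uses most of the available intensity range.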
