EvaNet: Towards More Efficient and Consistent Infrared and Visible Image Fusion Assessment

About

Evaluation is essential in image fusion research, yet most existing metrics are borrowed directly from other vision tasks without proper adaptation. These traditional metrics, often based on complex image transformations, not only fail to capture the true quality of fusion results but are also computationally demanding. To address these issues, we propose a unified evaluation framework specifically tailored for image fusion. At its core is a lightweight network designed to efficiently approximate widely used metrics, following a divide-and-conquer strategy. Unlike conventional approaches that directly assess similarity between fused and source images, we first decompose the fusion result into infrared and visible components. The evaluation model then measures the degree of information preservation in these separated components, effectively disentangling the fusion evaluation process. During training, we incorporate a contrastive learning strategy and inform our evaluation model with perceptual scene assessments provided by a large language model. Finally, we propose the first consistency evaluation framework, which measures the alignment between image fusion metrics and human visual perception, using both independent no-reference scores and downstream task performance as objective references. Extensive experiments show that our learning-based evaluation paradigm delivers both superior efficiency (up to 1,000 times faster) and greater consistency across a range of standard image fusion benchmarks. Our code will be publicly available at https://github.com/AWCXV/EvaNet.
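As a rough illustration of what "consistency" between a fusion metric and an objective reference could mean, the sketch below computes a Spearman rank correlation between a metric's scores and a set of reference scores over the same fused images. The abstract does not say how alignment is actually measured, so the use of Spearman correlation is an assumption, and the function names and sample scores are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(metric_scores, reference_scores):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(metric_scores) != len(reference_scores) or len(metric_scores) < 2:
        raise ValueError("need two equally sized lists with at least 2 items")
    rx = average_ranks(metric_scores)
    ry = average_ranks(reference_scores)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five fused images (illustrative values only).
metric = [0.61, 0.72, 0.55, 0.80, 0.67]    # scores from a fusion metric
reference = [3.1, 3.9, 2.8, 4.0, 3.0]      # e.g. no-reference quality scores
consistency = spearman(metric, reference)
```

A value near 1 would indicate that the metric ranks fused images in much the same order as the reference, and a value near 0 would indicate little agreement. A rank-based measure is used here only because it does not require the two score scales to be comparable.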

Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Tao Zhou, Hui Li, Zhangyong Tang, Josef Kittler • 2026

Related benchmarks

Task | Dataset | Result | Rank
Image Fusion Evaluation | LLVIP | Inference Time: 6 | 3
Image Fusion Metric Consistency Assessment | LLVIP Dataset | VIF: 72.4 | 3
Image Fusion Metric Consistency Assessment | TNO Dataset | VIF (MCdeep): 63.8 | 3
Object Detection | Downstream Detection Dataset | VIF (Downstream Dataset): 0.752 | 3
Image Fusion Metric Consistency Assessment | RoadScene Dataset | VIF: 0.629 | 3
Reference-free Image Quality Assessment | Reference-free Image Fusion Quality Assessment Dataset | EN: 0.676 | 2
Semantic segmentation | Downstream Segmentation | VIF Score: 68.2 | 2
Image Fusion Evaluation Speed | LLVIP part (test) | -- | 1
Image Fusion Evaluation Speed | LLVIP all (test) | -- | 1