
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking

About

The real world is dynamic, yet most image fusion methods process static frames independently, ignoring temporal correlations in videos and leading to flickering and temporal inconsistency. To address this, we propose Unified Video Fusion (UniVF), a novel and unified framework for video fusion that leverages multi-frame learning and optical flow-based feature warping for informative, temporally coherent video fusion. To support its development, we also introduce Video Fusion Benchmark (VF-Bench), the first comprehensive benchmark covering four video fusion tasks: multi-exposure, multi-focus, infrared-visible, and medical fusion. VF-Bench provides high-quality, well-aligned video pairs obtained through synthetic data generation and rigorous curation from existing datasets, with a unified evaluation protocol that jointly assesses the spatial quality and temporal consistency of video fusion. Extensive experiments show that UniVF achieves state-of-the-art results across all tasks on VF-Bench. Project page: https://vfbench.github.io.

Zixiang Zhao, Haowen Bai, Bingxin Ke, Yukun Cui, Lilun Deng, Yulun Zhang, Kai Zhang, Konrad Schindler • 2025
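
The abstract mentions optical flow-based feature warping without giving details. As a rough illustration of the general idea only (not UniVF's actual implementation), the sketch below backward-warps a single-channel feature map from a neighbouring frame onto the current frame's pixel grid using bilinear interpolation. The function name `warp_bilinear`, the nested-list layout, the flow convention (displacement from the current frame to the neighbour), and the border clamping are all assumptions made for this example.

```python
import math


def warp_bilinear(feat, flow_x, flow_y):
    """Backward-warp a 2D feature map with a dense flow field.

    Hypothetical helper for illustration; not taken from the paper.
    feat:   H x W nested list, features of a neighbouring frame.
    flow_x: H x W nested list, horizontal displacement (in pixels) from
            each current-frame pixel to its match in the neighbour.
    flow_y: H x W nested list, vertical displacement, same convention.
    Returns an H x W nested list aligned to the current frame.
    """
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Sampling position in the neighbour frame, clamped to the border.
            sx = min(max(x + flow_x[y][x], 0.0), w - 1)
            sy = min(max(y + flow_y[y][x], 0.0), h - 1)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            wx, wy = sx - x0, sy - y0
            # Bilinear blend of the four surrounding feature values.
            top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
            bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
            out[y][x] = top * (1 - wy) + bottom * wy
    return out


# Toy 2 x 3 feature map; every pixel samples from half a pixel to its right.
feat = [[0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0]]
half = [[0.5] * 3 for _ in range(2)]
zero = [[0.0] * 3 for _ in range(2)]
warped = warp_bilinear(feat, half, zero)
```

In practice this operation would be applied per channel to learned feature tensors and implemented with a differentiable sampling routine from a deep learning framework, so that gradients can pass through the warp during training.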

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Fusion | VTMOT | QG | 57.24 | 13 |
| Infrared-Visible Video Fusion | M3SVD | CC | 56 | 13 |
| Infrared-Visible Video Fusion | M3SVD 2025 (test) | BiSWE | 7.316 | 13 |
| Infrared-Visible Video Fusion | VTMOT 2025 (test) | BiSWE | 8.551 | 13 |
| Infrared-Visible Video Fusion | HDO | CC | 0.632 | 13 |
| Infrared-Visible Video Fusion | VTMOT | Contrast Contribution (CC) | 0.59 | 13 |
| Infrared-Visible Video Fusion | HDO 2024 (test) | BiSWE | 6.378 | 13 |
| Object Tracking | NOT-156 | AUC | 21.2 | 13 |
| Infrared-Visible Video Fusion | NOT-156 | CC | 0.289 | 13 |
| Infrared-Visible Video Fusion | NOT-156 2025 (test) | BiSWE | 4.69 | 13 |
Showing 10 of 20 rows
