DRFusion: Drift-Resilient Temporally Consistent Infrared-Visible Video Fusion

About

Infrared and visible video fusion is essential for achieving comprehensive perception in dynamic scenes. However, maintaining temporal consistency remains a formidable challenge. Conventional methods relying on optical flow often suffer from geometric rigidity and ghosting artifacts. Moreover, standard diffusion-based fusion models typically operate in a frame-by-frame manner; when extended to autoregressive settings, they lack intrinsic temporal constraints and are prone to severe error accumulation and drifting, where minor artifacts amplify over time. To address these limitations, we propose a drift-resilient video fusion method that reformulates the task as history-conditioned motion generation. We introduce Stabilized History Guidance and Soft Temporal Anchoring to reframe temporal consistency as spectral filtering, implicitly aggregating motion dynamics without rigid alignment. Furthermore, our Decoupled Structure-Motion Adaptation strategy bridges pre-trained priors and structural constraints via two-stage training and latent refinement. Extensive experiments demonstrate that our method achieves state-of-the-art performance in both fusion quality and temporal stability.

Xingyuan Li, Haoyuan Xu, Shulin Li, Xiang Chen, Zhiying Jiang, Jinyuan Liu • 2026

Related benchmarks

Task                            Dataset               Result                             Rank
Infrared-Visible Video Fusion   HDO                   CC 0.682                           13
Infrared-Visible Video Fusion   NOT-156               CC 0.458                           13
Infrared-Visible Video Fusion   VTMOT                 Contrast Contribution (CC) 0.639   13
Object Tracking                 NOT-156               AUC 25.2                           13
Infrared-Visible Video Fusion   M3SVD                 CC 59.4                            13
Infrared-Visible Video Fusion   NOT-156 2025 (test)   BiSWE 4.816                        13
Infrared-Visible Video Fusion   HDO 2024 (test)       BiSWE 6.225                        13
Infrared-Visible Video Fusion   M3SVD 2025 (test)     BiSWE 6.467                        13
Infrared-Visible Video Fusion   VTMOT 2025 (test)     BiSWE 7.418                        13
