
Restoration-Oriented Video Frame Interpolation with Region-Distinguishable Priors from SAM

About

In existing restoration-oriented Video Frame Interpolation (VFI) approaches, motion estimation between neighboring frames plays a crucial role. However, estimation accuracy remains a challenge, primarily because of the inherent ambiguity in identifying corresponding areas in adjacent frames. Distinguishing different regions before motion estimation is therefore important for improving accuracy. In this paper, we introduce a solution that uses open-world segmentation models, e.g., SAM2 (Segment Anything Model 2), to derive Region-Distinguishable Priors (RDPs) in different frames. These RDPs are represented as spatially varying Gaussian mixtures, distinguishing an arbitrary number of areas with a unified modality. RDPs can be integrated into existing motion-based VFI methods to enhance features for motion estimation, facilitated by our plug-and-play Hierarchical Region-aware Feature Fusion Module (HRFFM). HRFFM incorporates RDPs into multiple hierarchical stages of the VFI encoder, using RDP-guided Feature Normalization (RDPFN) in a residual learning manner. With HRFFM and RDPs, the features within the VFI encoder exhibit similar representations for matched regions in neighboring frames, thus improving the synthesis of intermediate frames. Extensive experiments demonstrate that HRFFM consistently enhances VFI performance across various scenes.
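To make the idea of prior-guided feature normalization applied "in a residual learning manner" more concrete, here is a minimal NumPy sketch. It is not the authors' RDPFN: the abstract does not specify the layer's internals, so this sketch assumes a SPADE-style modulation in which a per-pixel scale and shift are linearly projected from the prior map and applied to instance-normalized features, and the result is added back to the input. The function name `rdp_guided_norm`, the weight matrices `w_gamma` and `w_beta`, and all tensor shapes are hypothetical.

```python
import numpy as np

def rdp_guided_norm(feat, rdp, w_gamma, w_beta, eps=1e-5):
    """Illustrative prior-guided normalization (an assumption, not the paper's layer).

    feat:    (C, H, W) encoder features
    rdp:     (K, H, W) region prior map with K channels
    w_gamma: (C, K) hypothetical projection from prior to per-pixel scale
    w_beta:  (C, K) hypothetical projection from prior to per-pixel shift
    """
    # Normalize each feature channel over its spatial extent.
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / (std + eps)

    # Per-pixel linear projection of the prior (equivalent to a 1x1 convolution).
    gamma = np.einsum("ck,khw->chw", w_gamma, rdp)
    beta = np.einsum("ck,khw->chw", w_beta, rdp)

    # Modulate the normalized features, then add the input back (residual path).
    modulated = normed * (1.0 + gamma) + beta
    return feat + modulated

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 16, 16))   # toy encoder features
rdp = rng.random(size=(4, 16, 16))    # toy prior map
w_gamma = 0.1 * rng.normal(size=(8, 4))
w_beta = 0.1 * rng.normal(size=(8, 4))

out = rdp_guided_norm(feat, rdp, w_gamma, w_beta)
print(out.shape)  # (8, 16, 16)
```

Because the output has the same shape as the input, a block like this could in principle be dropped in after any encoder stage, which is one plausible reading of "plug-and-play" and "hierarchical" in the abstract.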

Yan Han, Xiaogang Xu, Yingqi Lin, Jiafei Wu, Zhe Liu, Ming-Hsuan Yang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Frame Interpolation | Vimeo90K (test) | PSNR 36.69 | 153 |
| Video Frame Interpolation | SNU-FILM Extreme | PSNR 25.94 | 10 |
| Video Frame Interpolation | SNU-FILM Extreme | FVD 176.8 | 2 |
| Video Frame Interpolation | LAVIB | PSNR 31.97 | 2 |
