Efficient Feature Extraction for High-resolution Video Frame Interpolation

About

Most deep learning methods for video frame interpolation consist of three main components: feature extraction, motion estimation, and image synthesis. Existing approaches are mainly distinguishable in terms of how these modules are designed. However, when interpolating high-resolution images, e.g., at 4K, the design choices for achieving high accuracy within reasonable memory requirements are limited. The feature extraction layers help to compress the input and extract relevant information for the later stages, such as motion estimation. However, these layers are often costly in parameters, computation time, and memory. We show how ideas from dimensionality reduction combined with a lightweight optimization can be used to compress the input representation while keeping the extracted information suitable for frame interpolation. Further, we require neither a pretrained flow network nor a synthesis network, additionally reducing the number of trainable parameters and required memory. When evaluating on three 4K benchmarks, we achieve state-of-the-art image quality among the methods without pretrained flow while having the lowest network complexity and memory requirements overall.

Moritz Nottebaum, Stefan Roth, Simone Schaub-Meyer • 2022
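
The abstract mentions compressing the input representation with ideas from dimensionality reduction. As a rough illustration of that general idea only, the sketch below applies plain PCA to non-overlapping image patches with NumPy. PCA, the patch size, the number of components, and all function names here are assumptions chosen for the example. The page does not state which technique or settings the authors use, and the "lightweight optimization" they describe is not modeled here.

```python
import numpy as np


def extract_patches(frame, patch):
    """Split an (H, W, C) frame into flattened, non-overlapping patch x patch blocks."""
    h, w, c = frame.shape
    h, w = h - h % patch, w - w % patch  # crop so the frame tiles evenly
    blocks = frame[:h, :w].reshape(h // patch, patch, w // patch, patch, c)
    # Bring the two block-grid axes to the front, then flatten each block.
    return blocks.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)


def fit_pca(patches, k):
    """Return the patch mean and the top-k principal directions (rows of the basis)."""
    mean = patches.mean(axis=0)
    # Rows of vt are orthonormal directions sorted by decreasing variance.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:k]


def compress(patches, mean, basis):
    """Project each patch onto the k-dimensional PCA basis."""
    return (patches - mean) @ basis.T


def reconstruct(codes, mean, basis):
    """Map k-dimensional codes back to patch space."""
    return codes @ basis + mean


# Illustrative input: a random frame standing in for a real video frame.
rng = np.random.default_rng(0)
frame = rng.random((64, 96, 3))
patches = extract_patches(frame, patch=8)   # 96 patches of 8 * 8 * 3 = 192 values
mean, basis = fit_pca(patches, k=16)
codes = compress(patches, mean, basis)      # 96 patches, 16 values each
```

Each 192-value patch is reduced to 16 coefficients by a single linear projection, which hints at why a dimensionality-reduction view of feature extraction can be cheap in parameters and memory compared with a deep convolutional encoder.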

Related benchmarks

Task                             Dataset             Result       Rank
Multi-frame Video Interpolation  X 4K (test)         PSNR 30.46   43
Video Frame Interpolation        X 2K (test)         PSNR 31.12   29
Video Frame Interpolation        Xiph-2k             PSNR 34.8    29
Video Frame Interpolation        Xiph 4K (test)      PSNR 34.16   25
Video Frame Interpolation        X-L 2K (test)       PSNR 29.9    13
Video Frame Interpolation        X L 4K (test)       PSNR 29.3    12
Video Frame Interpolation        X (test)            PSNR 30.45   8
Video Frame Interpolation        Inter4K-L           SSIM 0.904   5
Video Frame Interpolation        Inter4K-S (test)    PSNR 29.29   5
Video Frame Interpolation        Inter4K L (test)    PSNR 25.16   5
Showing 10 of 12 rows
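
Most of the results listed here are PSNR values in decibels. For readers unfamiliar with the metric, the following is a minimal sketch of the standard PSNR definition for a single pair of images, assuming 8-bit pixel values (peak 255). The function name and the `max_value` default are illustrative. Each benchmark may use its own evaluation protocol (color space, cropping, averaging across frames), which this page does not specify.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Illustrative check: an estimate that is off by exactly one gray level everywhere.
reference = np.full((4, 4), 100, dtype=np.uint8)
estimate = np.full((4, 4), 101, dtype=np.uint8)
value = psnr(reference, estimate)  # MSE is 1, so this equals 20 * log10(255)
```

Because PSNR is logarithmic in the mean squared error, a 1 dB difference between two methods corresponds to roughly a 21% reduction in MSE.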
