BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
About
Existing video frame interpolation (VFI) models tend to suffer from time-to-location ambiguity when trained on videos with non-uniform motions, such as acceleration, deceleration, and changes of direction, which often yields blurred interpolated frames. In this paper, we propose (i) a novel motion description map, the Bidirectional Motion field (BiM), to effectively describe non-uniform motions; (ii) a BiM-guided Flow Net (BiMFN) with a Content-Aware Upsampling Network (CAUN) for precise optical flow estimation; and (iii) Knowledge Distillation for VFI-centric Flow supervision (KDVCF) to supervise the motion estimation of the VFI model with VFI-centric teacher flows. The proposed model is called the Bidirectional Motion field-guided VFI (BiM-VFI) model. Extensive experiments show that BiM-VFI significantly surpasses recent state-of-the-art VFI methods, improving LPIPS and STLPIPS by 26% and 45%, respectively, and yields interpolated frames with much less blur at arbitrary time instances.
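The time-to-location ambiguity mentioned in the abstract can be illustrated with a toy one-dimensional example. The sketch below is not the paper's BiM formulation; the function names and the two motion profiles are assumptions chosen only for illustration. It shows two motions that share the same start and end positions but place the object at different locations at the same intermediate time, so the target time alone does not determine the target location.

```python
# Toy illustration of time-to-location ambiguity (hypothetical helper names,
# not taken from the BiM-VFI paper or code).

def uniform_position(t, x0, x1):
    """Position at normalized time t in [0, 1] under constant velocity."""
    return x0 + (x1 - x0) * t


def accelerating_position(t, x0, x1):
    """Position at normalized time t under constant acceleration from rest.

    Displacement grows with t**2, so the object lags behind the
    constant-velocity case at every intermediate time.
    """
    return x0 + (x1 - x0) * t ** 2


x0, x1 = 0.0, 100.0  # object position in the two input frames
t = 0.5              # requested interpolation time

p_uniform = uniform_position(t, x0, x1)
p_accel = accelerating_position(t, x0, x1)

# Both motions agree at the input frames but disagree in between.
print(p_uniform, p_accel)
```

Given only the two input frames and `t`, both intermediate positions are plausible. A model trained on a mix of such motions without an additional motion description may settle on a compromise between them, which is one way to understand the blur the abstract describes.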
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Frame Interpolation | Vimeo90K (test) | PSNR 35.12 | 153 |
| Video Frame Interpolation | SNU-FILM Extreme | PSNR 24.63 | 10 |
| Video Frame Interpolation | SNU-FILM entire (16x downsampling) | LPIPS 0.074 | 5 |
| Video Frame Interpolation | XTest-entire (16x downsampling) | LPIPS 0.055 | 5 |
| Video Frame Interpolation | SNU-FILM entire 4x downsampling | LPIPS 0.032 | 5 |
| Video Frame Interpolation | SNU-FILM entire 8x downsampling | LPIPS 0.046 | 5 |