MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution
About
This work addresses continuous space-time video super-resolution (C-STVSR), which aims to up-scale an input video both spatially and temporally by arbitrary scaling factors. One key challenge of C-STVSR is to propagate information temporally among the input video frames. To this end, we introduce a space-time local implicit neural function. It has the striking feature of learning forward motion for a continuum of pixels. We motivate the use of forward motion from the perspective of learning individual motion trajectories, as opposed to learning a mixture of motion trajectories with backward motion. To ease motion interpolation, we encode sparsely sampled forward motion extracted from the input video as the contextual input. Along with a reliability-aware splatting and decoding scheme, our framework, termed MoTIF, achieves state-of-the-art performance on C-STVSR. The source code of MoTIF is available at https://github.com/sichun233746/MoTIF.
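To make the idea of forward motion with reliability-weighted splatting more concrete, the following is a minimal NumPy sketch. It is not the MoTIF implementation: the function name `forward_splat`, the nearest-neighbor rounding, and the plain weighted average are all assumptions chosen for illustration. Each source pixel is pushed along its own scaled motion vector to a target time `t`, and pixels that collide at the same target location are blended according to their reliability weights.

```python
import numpy as np

def forward_splat(frame, flow, reliability, t):
    """Illustrative forward warping (hypothetical helper, not MoTIF's code).

    frame:       (H, W) source intensities
    flow:        (H, W, 2) forward motion (dx, dy) of each source pixel
    reliability: (H, W) non-negative weight of each source pixel
    t:           fraction of the motion to apply, e.g. 0.5 for the midpoint
    Returns the splatted frame and the accumulated weight per target pixel.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Each source pixel travels along its own trajectory to time t.
    # Nearest-neighbor rounding is an assumption made to keep the sketch short.
    tx = np.rint(xs + t * flow[..., 0]).astype(int)
    ty = np.rint(ys + t * flow[..., 1]).astype(int)
    inside = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)

    num = np.zeros((h, w), dtype=float)
    den = np.zeros((h, w), dtype=float)
    # np.add.at accumulates correctly when several sources hit one target.
    np.add.at(num, (ty[inside], tx[inside]), (reliability * frame)[inside])
    np.add.at(den, (ty[inside], tx[inside]), reliability[inside])

    # Targets that received no contribution stay at zero (holes).
    warped = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return warped, den

# Two pixels collide at x = 1; the more reliable one dominates the blend.
frame = np.array([[10.0, 20.0]])
flow = np.zeros((1, 2, 2))
flow[0, 0, 0] = 1.0  # the left pixel moves one step to the right
reliability = np.array([[1.0, 3.0]])
warped, weight = forward_splat(frame, flow, reliability, t=1.0)
```

In this toy case the target pixel at x = 1 receives both sources and the location at x = 0 is left as a hole. A backward-warping scheme would instead look up one source location per target pixel, which is the contrast the abstract draws between individual and mixed motion trajectories.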
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Video Super-Resolution | UDM10 (test) | PSNR | 24.97 | 51 |
| Spatiotemporal Video Super-Resolution | BS-ERGB | PSNR | 24.21 | 29 |
| Continuous Space-Time Video Super-Resolution | Standard Clip 180x320 resolution | TFLOPs | 2.043 | 24 |
| Space-Time Video Super-Resolution | GoPro Average (test) | PSNR | 30.04 | 24 |
| Space-Time Video Super-Resolution | Adobe-Average (test) | PSNR | 29.82 | 24 |
| Spatiotemporal Video Super-Resolution | GoPro Center | PSNR | 31.04 | 15 |
| Spatiotemporal Video Super-Resolution | Adobe240 Center | PSNR | 30.63 | 15 |
| Video Super-Resolution | ALPIX-VSR | PSNR | 38.61 | 4 |
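
Most rows above report PSNR. As a reminder of what that number measures, here is a small sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE), in decibels. The helper name `psnr` and the default peak value of 255 (8-bit images) are assumptions. The evaluation protocol behind the listed results (color space, cropping, per-frame averaging) is not stated here, so this sketch should not be expected to reproduce those figures exactly.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays.

    `peak` is the maximum possible pixel value (255 assumed for 8-bit data).
    """
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return float(10.0 * np.log10(peak ** 2 / mse))

# An error of exactly one gray level at every pixel gives about 48.13 dB.
value = psnr(np.zeros((2, 2)), np.ones((2, 2)))
```

Higher is better, and because the scale is logarithmic, a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error.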