Temporal Modulation Network for Controllable Space-Time Video Super-Resolution
About
Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution, low-frame-rate videos. Recently, deformable-convolution-based methods have achieved promising STVSR performance, but they can only infer the intermediate frames pre-defined in the training stage. In addition, these methods undervalue the short-term motion cues among adjacent frames. In this paper, we propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction. Specifically, we propose a Temporal Modulation Block (TMB) to modulate deformable convolution kernels for controllable feature interpolation. To better exploit the temporal information, we propose a Locally-temporal Feature Comparison (LFC) module, along with the Bi-directional Deformable ConvLSTM, to extract short-term and long-term motion cues in videos. Experiments on three benchmark datasets demonstrate that our TMNet outperforms previous STVSR methods. The code is available at https://github.com/CS-GangXu/TMNet.
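To make "arbitrary intermediate frame(s)" concrete, the following sketch shows only the bookkeeping on the temporal axis. It assumes that each new frame is indexed by a normalized time between 0 and 1 relative to two adjacent input frames, and that the requested frames are evenly spaced; both are illustrative assumptions. The function names `intermediate_times` and `output_frame_count` are hypothetical and are not taken from the TMNet code.

```python
from fractions import Fraction


def intermediate_times(num_new_frames):
    """Evenly spaced normalized times strictly between two adjacent input frames.

    Assumption: time 0 is the earlier input frame and time 1 is the later one.
    """
    if num_new_frames < 0:
        raise ValueError("num_new_frames must be non-negative")
    return [Fraction(i, num_new_frames + 1) for i in range(1, num_new_frames + 1)]


def output_frame_count(num_input_frames, num_new_frames):
    """Total frames after inserting num_new_frames between every adjacent pair."""
    if num_input_frames < 1:
        raise ValueError("need at least one input frame")
    return num_input_frames + (num_input_frames - 1) * num_new_frames


# One new frame per gap: a single midpoint, and 4 input frames become 7.
print(intermediate_times(1), output_frame_count(4, 1))
# Three new frames per gap: quarter steps, and 4 input frames become 13.
print(intermediate_times(3), output_frame_count(4, 3))
```

A method restricted to a frame position fixed at training time would correspond to a single hard-coded entry in such a list, whereas a controllable method can be queried at any of these times.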
Related benchmarks
| Task | Dataset | PSNR (dB) | Rank |
|---|---|---|---|
| Video Super-Resolution | Vimeo-90K Fast (test) | 37.04 | 39 |
| Video Super-Resolution | Vimeo-90K Slow (test) | 33.51 | 39 |
| Video Super-Resolution | Vimeo-90K Medium (test) | 35.6 | 39 |
| Video Super-Resolution | Vimeo-90k Fast | 37.04 | 35 |
| Space-Time Video Super-Resolution | Vid4 | 26.43 | 33 |
| Space-Time Video Super-Resolution | Vid4 (test) | 26.43 | 31 |
| Space-Time Video Super-Resolution | GoPro | 30.49 | 30 |
| Video Super-Resolution | Vimeo-90k Slow | 33.51 | 30 |
| Video Super-Resolution | Vimeo-90k Medium | 35.6 | 30 |
| Space-Time Video Super-Resolution | Adobe-Average (test) | 28.3 | 24 |
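
The results above are reported as PSNR in decibels. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), over flat sequences of pixel values. The helper name `psnr` and the 8-bit peak value of 255 are assumptions made for illustration; the sketch does not reproduce the evaluation protocol of any benchmark listed here (for example, which color channel is scored).

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Two pixels, each off by 10 gray levels: MSE = 100.
print(round(psnr([100, 100], [110, 90]), 2))
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.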