Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

About

In video super-resolution, frame-wise alignment is commonly used to support the propagation of information over time. The role of alignment is well studied for low-level video enhancement, but existing works overlook a critical step: resampling. We show through extensive experiments that, for alignment to be effective, the resampling should preserve the reference frequency spectrum while minimizing spatial distortions. However, most existing works default to bilinear interpolation for resampling, even though bilinear interpolation has a smoothing effect that hinders super-resolution. Based on these observations, we propose an implicit resampling-based alignment. The sampling positions are encoded by a sinusoidal positional encoding, while the value is estimated with a coordinate network and a window-based cross-attention. We show that bilinear interpolation inherently attenuates high-frequency information, while an MLP-based coordinate network can approximate more frequencies. Experiments on synthetic and real-world datasets show that alignment with our proposed implicit resampling enhances the performance of state-of-the-art frameworks with minimal impact on both compute and parameters.
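The claim that bilinear interpolation attenuates high frequencies can be illustrated with a small 1-D sketch. This is not the paper's experiment or code: it assumes a periodic 1-D cosine signal, and the helper names `linear_resample` and `amplitude_gain` are hypothetical, introduced only for illustration. The sketch resamples the signal at a sub-pixel shift with linear interpolation (the 1-D analogue of bilinear) and measures how much of the original amplitude survives at a given frequency.

```python
import numpy as np


def linear_resample(signal, shift):
    """Resample a periodic 1-D signal at positions n + shift using
    linear interpolation (the 1-D analogue of bilinear resampling)."""
    length = len(signal)
    pos = np.arange(length) + shift
    left = np.floor(pos).astype(int)
    frac = pos - left
    # Wrap indices so the signal is treated as periodic (an assumption).
    left_val = signal[left % length]
    right_val = signal[(left + 1) % length]
    return (1.0 - frac) * left_val + frac * right_val


def amplitude_gain(freq_bin, shift, length=64):
    """Ratio of the spectral amplitude at `freq_bin` after resampling
    to the amplitude before resampling, for a pure cosine input."""
    n = np.arange(length)
    signal = np.cos(2.0 * np.pi * freq_bin * n / length)
    resampled = linear_resample(signal, shift)
    before = np.abs(np.fft.rfft(signal))[freq_bin]
    after = np.abs(np.fft.rfft(resampled))[freq_bin]
    return after / before


if __name__ == "__main__":
    # Compare a low and a high frequency at a half-sample shift.
    for k in (2, 16, 30):
        print(f"bin {k:2d}: gain = {amplitude_gain(k, 0.5):.4f}")
```

Under these assumptions, a half-sample shift averages two neighbouring samples, so the gain at bin k of an N-sample signal is |cos(πk/N)|: close to 1 for low frequencies and approaching 0 near the Nyquist frequency. A shift of 0 leaves the signal unchanged.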

Kai Xu, Ziwei Yu, Xin Wang, Michael Bi Mi, Angela Yao • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Video Super-Resolution | REDS4 4x (test) | PSNR 32.9 | 96 |
| Video Super-Resolution | REDS4 | SSIM 0.9138 | 82 |
| Video Super-Resolution | Vid4 Y (test) | PSNR 29.68 | 30 |
| 4x Video Super-Resolution | Vimeo-90K-T (test) | PSNR 38.14 | 28 |
| Video Super-Resolution | REDS4 RGB (test) | PSNR 32.9 | 25 |
| 4x Video Super-Resolution | REDS4 (test) | PSNR 32.9 | 24 |
| Video Super-Resolution | SDSD-out | PSNR 24.15 | 24 |
| Video Super-Resolution | SDE out | PSNR 19.82 | 24 |
| Video Super-Resolution | SDE-in | PSNR 19.33 | 24 |
| Video Super-Resolution | SDSD-in | PSNR 23.74 | 24 |
Showing 10 of 30 rows
