
Video Frame Interpolation via Adaptive Convolution

About

Video frame interpolation typically involves two steps: motion estimation and pixel synthesis. Such a two-step approach heavily depends on the quality of motion estimation. This paper presents a robust video frame interpolation method that combines these two steps into a single process. Specifically, our method considers pixel synthesis for the interpolated frame as local convolution over two input frames. The convolution kernel captures both the local motion between the input frames and the coefficients for pixel synthesis. Our method employs a deep fully convolutional neural network to estimate a spatially-adaptive convolution kernel for each pixel. This deep neural network can be directly trained end to end using widely available video data without any difficult-to-obtain ground-truth data like optical flow. Our experiments show that the formulation of video interpolation as a single convolution process allows our method to gracefully handle challenges like occlusion, blur, and abrupt brightness change and enables high-quality video frame interpolation.
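The single-process formulation described above can be sketched as a naive per-pixel adaptive convolution. This is an illustrative toy, not the paper's implementation: the `(k, 2k)` kernel layout (left half weighting frame 1, right half frame 2), the function names, and the tiny kernel size are assumptions for clarity, whereas the actual method predicts much larger spatially-adaptive kernels with a deep CNN.

```python
import numpy as np

def synthesize_pixel(patch1, patch2, kernel):
    """Synthesize one interpolated pixel as a local convolution over
    co-located patches from the two input frames. The (k, 2k) kernel's
    left half weights patch1 and its right half weights patch2, so the
    kernel jointly encodes local motion and synthesis coefficients."""
    k = patch1.shape[0]
    return float(np.sum(kernel[:, :k] * patch1) + np.sum(kernel[:, k:] * patch2))

def interpolate_frame(frame1, frame2, kernels, k=3):
    """Naive grayscale interpolation: kernels[y, x] is the (k, 2k) kernel
    that, in the paper's method, a CNN would estimate for pixel (y, x)."""
    pad = k // 2
    f1 = np.pad(frame1, pad, mode="edge")  # replicate borders so every
    f2 = np.pad(frame2, pad, mode="edge")  # pixel has a full k x k patch
    h, w = frame1.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            p1 = f1[y:y + k, x:x + k]
            p2 = f2[y:y + k, x:x + k]
            out[y, x] = synthesize_pixel(p1, p2, kernels[y, x])
    return out
```

For example, with a uniform kernel that sums to one, interpolating between an all-zeros and an all-ones frame yields 0.5 everywhere; a learned kernel would instead concentrate its weight along the local motion path between the two frames.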

Simon Niklaus, Long Mai, Feng Liu • 2017

Related benchmarks

Task | Dataset | Result | Rank
Video Frame Interpolation | Vimeo90K (test) | PSNR 32.33 | 131
Video Frame Interpolation | Middlebury | -- | 42
Video Frame Interpolation | Middlebury (other) | IE 2.27 | 16
Video Frame Interpolation | UCF101 (middle timestep) | PSNR 34.78 | 16
Video Frame Interpolation | Vimeo90K (test) | PSNR 33.79 | 16
Video Interpolation | GoPro (15 frames skip, test) | PSNR 23.23 | 14
Video Frame Interpolation | HD (middle timestep) | PSNR 30.87 | 12
Video Interpolation | KTH (64 x 64, test) | PSNR 29.21 | 9
Video Interpolation | SMMNIST (64 x 64, test) | PSNR 14.759 | 9
Frame Interpolation | Vimeo-90K (septuplet) | PSNR 33.6 | 9
Showing 10 of 22 rows
