# Stochastic Video Generation with a Learned Prior

## About
Generating video frames that accurately predict future world states is challenging. Existing approaches either fail to capture the full distribution of outcomes, or yield blurry generations, or both. In this paper we introduce an unsupervised video generation model that learns a prior model of uncertainty in a given environment. Video frames are generated by drawing samples from this prior and combining them with a deterministic estimate of the future frame. The approach is simple and easily trained end-to-end on a variety of datasets. Sample generations are both varied and sharp, even many frames into the future, and compare favorably to those from existing approaches.
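The generation procedure described above can be illustrated with a minimal sketch. This is not the paper's implementation: the learned networks (encoder, prior, decoder) are replaced by toy linear maps, and all names and dimensions here are illustrative. The structure it shows is the paper's recipe: at each step, sample a latent z_t from a prior conditioned on the past, then combine it with a deterministic estimate of the next frame.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the learned networks (illustrative only, not the
# paper's code): the prior maps the previous state to the parameters of
# a Gaussian over z_t; the decoder combines that state with a sampled
# z_t to produce the next frame.
H, Z, F = 8, 4, 16                           # hidden, latent, frame dims
W_enc = rng.normal(size=(H, F)) * 0.1        # frame -> hidden state
W_mu = rng.normal(size=(Z, H)) * 0.1         # prior mean head
W_logvar = rng.normal(size=(Z, H)) * 0.1     # prior log-variance head
W_dec = rng.normal(size=(F, H + Z)) * 0.1    # (state, z_t) -> next frame

def generate(x0, steps):
    """Roll the model forward: draw z_t from the learned prior and
    combine it with a deterministic estimate from the previous frame."""
    frames = [x0]
    for _ in range(steps):
        h = np.tanh(W_enc @ frames[-1])              # deterministic state
        mu, logvar = W_mu @ h, W_logvar @ h          # prior over z_t
        z = mu + np.exp(0.5 * logvar) * rng.normal(size=Z)  # sample z_t
        frames.append(np.tanh(W_dec @ np.concatenate([h, z])))
    return frames

rollout = generate(rng.normal(size=F), steps=5)
print(len(rollout), rollout[1].shape)  # 6 frames (x0 + 5 generated), each length 16
```

Re-running `generate` from the same conditioning frame yields different rollouts, since each step draws a fresh sample from the prior — this is where the "varied" generations come from.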
Remi Denton and Rob Fergus · 2018

## Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Video Prediction | BAIR (test) | FVD | 255 | 59 |
| Video Prediction | Moving MNIST | SSIM | 0.907 | 52 |
| Video Prediction | KTH | PSNR | 28.06 | 35 |
| Video Prediction | BAIR Push (test) | FVD | 256.6 | 30 |
| Video Prediction | KTH (test) | FVD | 157.9 | 24 |
| Future video prediction | BAIR 64x64 and 256x256 (test) | FVD | 315 | 16 |
| Video Prediction | Human3.6M | SSIM | 0.893 | 16 |
| Video modeling | BAIR Robot Pushing (test) | FVD | 262.5 | 14 |
| Video Prediction | BAIR 64x64 | FVD | 315 | 14 |
| Video Prediction | Moving MNIST two-digits (test) | PSNR | 14.5 | 9 |
*Showing 10 of 26 rows.*
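Of the metrics above, PSNR is the simplest to state: it is the log-scaled ratio of the maximum possible pixel value to the mean squared error between predicted and ground-truth frames, in decibels (higher is better; SSIM and FVD are more involved perceptual/distributional measures). A minimal sketch of the standard PSNR formula:

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two frames.

    max_val is the maximum possible pixel intensity (1.0 for frames
    scaled to [0, 1], 255 for 8-bit images).
    """
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a uniform error of 0.1 per pixel gives MSE = 0.01,
# so PSNR = 10 * log10(1 / 0.01) = 20 dB.
a = np.zeros((64, 64))
b = np.full((64, 64), 0.1)
print(round(psnr(b, a), 2))  # 20.0
```

A PSNR of 28.06 dB on KTH, as in the table, thus corresponds to a mean squared error of roughly 10^(-2.806) ≈ 0.0016 on [0, 1]-scaled frames.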