Stochastic Latent Residual Video Prediction

About

Designing video prediction models that account for the inherent uncertainty of the future is challenging. Most works in the literature are based on stochastic image-autoregressive recurrent networks, which raises several performance and applicability issues. An alternative is to use fully latent temporal models which untie frame synthesis and temporal dynamics. However, no such model for stochastic video prediction has been proposed in the literature yet, due to design and training difficulties. In this paper, we overcome these difficulties by introducing a novel stochastic temporal model whose dynamics are governed in a latent space by a residual update rule. This first-order scheme is motivated by discretization schemes of differential equations. It naturally models video dynamics as it allows our simpler, more interpretable, latent model to outperform prior state-of-the-art methods on challenging datasets.
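The residual update rule mentioned above can be read as an explicit Euler step in latent space: the next latent state is the current one plus an increment that depends on the current state and a freshly sampled random variable. The sketch below is a minimal, dependency-free illustration of that idea and not the authors' implementation. The names `residual_rollout`, `toy_f`, `toy_sample_z` and the step size `dt` are hypothetical; in the actual model the update function and the noise distribution would presumably be learned networks, and a separate decoder (omitted here) would map each latent state to a frame.

```python
import random


def residual_rollout(y0, f, sample_z, steps, dt=1.0):
    """Roll out latent states with y_{t+1} = y_t + dt * f(y_t, z_{t+1}).

    y0       : initial latent state (sequence of floats)
    f        : update function (y, z) -> increment, same length as y
    sample_z : function y -> stochastic variable z for the next step
    steps    : number of residual updates to apply
    """
    ys = [list(y0)]
    for _ in range(steps):
        y = ys[-1]
        z = sample_z(y)        # stochastic variable for this step
        delta = f(y, z)        # increment predicted from state and noise
        # First-order (Euler-style) residual update.
        ys.append([yi + dt * di for yi, di in zip(y, delta)])
    return ys


# Hypothetical toy stand-ins for the learned components.
rng = random.Random(0)


def toy_sample_z(y):
    # Standard normal noise with the same dimension as the state.
    return [rng.gauss(0.0, 1.0) for _ in y]


def toy_f(y, z):
    # Rotation-like drift plus a small noise-driven perturbation.
    return [0.1 * y[1] + 0.01 * z[0], -0.1 * y[0] + 0.01 * z[1]]


traj = residual_rollout([1.0, 0.0], toy_f, toy_sample_z, steps=5)
print(len(traj), len(traj[0]))  # 6 2
```

Because frames never feed back into the recurrence, the rollout runs entirely in latent space, which is what "untie frame synthesis and temporal dynamics" refers to.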

Jean-Yves Franceschi, Edouard Delasalles, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Prediction | BAIR (test) | FVD | 162 | 59 |
| Video Prediction | KTH | PSNR | 29.69 | 35 |
| Video Prediction | BAIR Push (test) | FVD | 141.7 | 30 |
| Video Prediction | KTH (test) | FVD | 222 | 24 |
| Future video prediction | BAIR 64x64 and 256x256 (test) | FVD | 181 | 16 |
| Video Prediction | BAIR 64x64 | FVD | 181 | 14 |
| Video Synthesis | iPER (test) | FVD | 245.1 | 11 |
| Video Prediction | Moving MNIST two-digits (test) | PSNR | 18.25 | 9 |
| Video Prediction | Human3.6M (test) | FVD | 174.7 | 9 |
| Proxy-supervised Video Generation | BAIR 64x64 Full (test) | LPIPS | 0.491 | 6 |
Showing 10 of 16 rows
