Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

About

Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video prediction methods. Since physics is too restrictive for describing the full visual content of generic videos, we introduce PhyDNet, a two-branch deep architecture that explicitly disentangles PDE dynamics from unknown complementary information. A second contribution is a new recurrent physical cell (PhyCell), inspired by data assimilation techniques, for performing PDE-constrained prediction in latent space. Extensive experiments conducted on four diverse datasets show the ability of PhyDNet to outperform state-of-the-art methods. Ablation studies also highlight the important gain brought by both disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet presents interesting features for dealing with missing data and long-term forecasting.
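To make the two ideas in the abstract more concrete, here is a minimal sketch of a data-assimilation-style prediction and correction step on a 1D latent state, followed by a two-branch combination. It is an illustration under stated assumptions, not the authors' implementation: the function names (`laplacian_1d`, `phycell_step`, `two_branch_combine`), the fixed diffusion operator, the scalar `gain`, and the additive combination of branches are all hypothetical stand-ins. A learned model would presumably replace the fixed operator with trainable, PDE-constrained filters.

```python
from typing import List, Optional


def laplacian_1d(h: List[float]) -> List[float]:
    """Second-order central difference with periodic boundaries."""
    n = len(h)
    return [h[(i - 1) % n] - 2.0 * h[i] + h[(i + 1) % n] for i in range(n)]


def phycell_step(
    h: List[float],
    obs_embedding: Optional[List[float]],
    gain: float,
    diffusivity: float = 0.1,
) -> List[float]:
    """One prediction-correction step on a 1D latent state (hypothetical sketch).

    Assumption: the latent dynamics follow a fixed diffusion PDE,
    dh/dt = diffusivity * d2h/dx2, used here only as a stand-in.
    """
    # Prediction: explicit Euler step of the assumed PDE.
    lap = laplacian_1d(h)
    h_pred = [hi + diffusivity * li for hi, li in zip(h, lap)]

    # Missing observation: rely on the PDE prediction alone.
    if obs_embedding is None:
        return h_pred

    # Correction: move the prediction toward the embedded observation,
    # weighted by a gain in [0, 1] (a scalar here, for simplicity).
    return [hp + gain * (e - hp) for hp, e in zip(h_pred, obs_embedding)]


def two_branch_combine(h_phys: List[float], h_res: List[float]) -> List[float]:
    """Combine physical and residual latent states (assumed additive)."""
    return [p + r for p, r in zip(h_phys, h_res)]


if __name__ == "__main__":
    h0 = [0.0, 0.0, 1.0, 0.0, 0.0]
    # No observation available: prediction only.
    print(phycell_step(h0, None, gain=0.5))
    # Observation available: prediction blended with the observation.
    print(phycell_step(h0, [0.0, 0.0, 0.5, 0.0, 0.0], gain=0.5))
```

Passing `None` as the observation mimics the missing-data and long-term forecasting settings mentioned above: the state is then advanced by the assumed PDE alone, with no correction.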

Vincent Le Guen, Nicolas Thome • 2020

Related benchmarks

Task	Dataset	Metric	Value	(unlabeled)
Video Prediction	KTH 10 -> 20 steps (test)	PSNR	23.41	102
Video Prediction	Moving MNIST	SSIM	0.947	83
Video Prediction	Moving MNIST (test)	MSE	24.4	82
Human Motion Prediction	Human3.6M	--	--	50
Precipitation forecasting	SEVIR (test)	MSE	4.8165	47
Spatio-temporal forecasting	TaxiBJ	MSE	0.3622	45
Precipitation nowcasting	MeteoNet	SSIM	0.7823	42
Precipitation nowcasting	SEVIR	TFLOPs	0.08	41
Video Prediction	Moving-MNIST 10 → 10 (test)	MSE	24.4	39
Video Prediction	KTH	PSNR	28.01	35

Showing 10 of 57 rows
