Conditional Flow Variational Autoencoders for Structured Sequence Prediction

About

Prediction of future states of the environment and interacting agents is a key competence required for autonomous agents to operate successfully in the real world. Prior work on structured sequence prediction based on latent variable models imposes a uni-modal standard Gaussian prior on the latent variables. This induces a strong model bias, which makes it challenging to fully capture the multi-modality of the distribution of future states. In this work, we introduce Conditional Flow Variational Autoencoders (CF-VAE), which use our novel conditional normalizing-flow-based prior to capture complex multi-modal conditional distributions for effective structured sequence prediction. Moreover, we propose two novel regularization schemes that stabilize training and counter posterior collapse, yielding a better fit to the target data distribution. Our experiments on three multi-modal structured sequence prediction datasets -- MNIST Sequences, Stanford Drone and HighD -- show that the proposed method obtains state-of-the-art results across different evaluation metrics.
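To make the idea of a conditional flow prior concrete, here is a minimal sketch of the change-of-variables computation for a single conditional affine layer. It is an illustration under stated assumptions, not the authors' architecture: the function name, the linear conditioners `W_mu` and `W_s`, and the single-layer design are all hypothetical. One affine layer is still Gaussian given the condition, so capturing multi-modal distributions as described in the abstract would require non-linear flow layers.

```python
import numpy as np

def conditional_affine_log_prior(z, x, W_mu, W_s):
    """Return log p(z | x) for z = mu(x) + exp(s(x)) * eps, eps ~ N(0, I).

    z: latent vector, shape (d,)
    x: conditioning vector, shape (c,)
    W_mu, W_s: hypothetical linear conditioners, shape (c, d)
    """
    mu = x @ W_mu                    # conditional shift, shape (d,)
    s = x @ W_s                      # conditional log-scale, shape (d,)
    eps = (z - mu) * np.exp(-s)      # inverse map from z back to the base variable
    # Log-density of eps under the standard Gaussian base distribution.
    log_base = -0.5 * np.sum(eps ** 2) - 0.5 * z.size * np.log(2.0 * np.pi)
    log_det = -np.sum(s)             # log |det(d eps / d z)|
    return log_base + log_det
```

With `W_mu` and `W_s` set to zero the expression reduces to the standard Gaussian log-density, which is the uni-modal prior the abstract contrasts against.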

Apratim Bhattacharyya, Michael Hanselmann, Mario Fritz, Bernt Schiele, Christoph-Nikolas Straehle • 2019

Related benchmarks

Task	Dataset	Result
Trajectory Prediction	SDD	ADE 12.6	64
Future Trajectory Prediction	SDD (Stanford Drone Dataset) (test)	ADE 12.6	51
Trajectory Forecasting	Stanford Drone Dataset	Average Displacement Error (ADE) 12.6	35
Vehicle Trajectory Prediction	HighD (test)	--	25
Trajectory Prediction	Stanford Drone (test)	minADE (20) 12.6	19
Pedestrian trajectory prediction	Stanford Drone Dataset	ADE 12.6	17
Stroke completion	MNIST Sequence (test)	CLL Score 104.3	8
Trajectory Prediction	Stanford Drone (5-fold cross val)	Error @ 1sec 0.7	8
Trajectory Prediction	Stanford Drone (single split)	mADE 12.6	5
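The table reports Average Displacement Error (ADE) and minADE without defining them. The sketch below follows the commonly used definitions: the mean Euclidean distance between predicted and ground-truth positions over time steps, and the lowest such value among K sampled trajectories. The function names and array shapes are assumptions for illustration, and the units and evaluation protocol behind the listed numbers are not specified here.

```python
import numpy as np

def ade(pred, truth):
    """Mean Euclidean distance over time steps; pred and truth have shape (T, 2)."""
    return float(np.mean(np.linalg.norm(pred - truth, axis=-1)))

def min_ade(samples, truth):
    """Lowest ADE among K sampled trajectories; samples has shape (K, T, 2)."""
    # Per-sample ADE: distance at every step, averaged over the T steps.
    per_sample = np.mean(np.linalg.norm(samples - truth[None], axis=-1), axis=-1)
    return float(np.min(per_sample))
```

Under this reading, an entry such as "minADE (20)" would correspond to calling `min_ade` with K = 20 samples per test sequence.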

