Composing graphical models with neural networks for structured representations and fast inference
About
We propose a general modeling and inference framework that composes probabilistic graphical models with deep learning methods and combines their respective strengths. Our model family augments graphical structure in latent variables with neural network observation models. For inference, we extend variational autoencoders to use graphical model approximating distributions with recognition networks that output conjugate potentials. All components of these models are learned simultaneously with a single objective, giving a scalable algorithm that leverages stochastic variational inference, natural gradients, graphical model message passing, and the reparameterization trick. We illustrate this framework with several example models and an application to mouse behavioral phenotyping.
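As a rough illustration of what "recognition networks that output conjugate potentials" could look like in the simplest case, the sketch below assumes a single latent variable with a diagonal Gaussian prior. It is not the authors' implementation: the function names, the linear recognition map, and the weights `W_h` and `W_j` are all hypothetical. The network emits Gaussian potentials `(h, J)`, which are added to the prior's natural parameters, and a sample is then drawn with the reparameterization trick.

```python
import numpy as np

def recognition_potentials(x, W_h, W_j):
    """Hypothetical linear recognition network (an assumption, not from the paper).

    Maps an observation x to diagonal Gaussian conjugate potentials (h, J)
    on the latent z, where J acts as an added precision and must be positive.
    """
    h = W_h @ x
    J = np.logaddexp(0.0, W_j @ x)  # softplus keeps the precision positive
    return h, J

def combine_with_prior(prior_mean, prior_var, h, J):
    """Add the potentials to the prior's natural parameters (conjugate update)."""
    eta1 = prior_mean / prior_var + h      # precision-weighted mean
    eta2 = -0.5 / prior_var - 0.5 * J      # minus one half of the precision
    post_var = -0.5 / eta2                 # back to mean/variance form
    post_mean = eta1 * post_var
    return post_mean, post_var

def reparameterized_sample(mean, var, rng):
    """Draw z = mean + sqrt(var) * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mean.shape)
    return mean + np.sqrt(var) * eps

# Example usage with made-up sizes: 3-dimensional observation, 2-dimensional latent.
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
W_h = rng.standard_normal((2, 3))
W_j = rng.standard_normal((2, 3))
h, J = recognition_potentials(x, W_h, W_j)
post_mean, post_var = combine_with_prior(np.zeros(2), np.ones(2), h, J)
z = reparameterized_sample(post_mean, post_var, rng)
```

Because the update happens in natural-parameter space, the same addition carries over to structured priors, where message passing would replace the closed-form conversion used here. The sample `z` is a deterministic function of `post_mean`, `post_var`, and the noise, which is what lets gradients flow back to the recognition weights.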
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Generative Modeling | Human Motion Capture h3.6m | Log Likelihood | 2.36 | 10 |
| Interpolation | Human Motion Capture h3.6m | FID (0.0-0.8) | 28.8 | 10 |
| Interpolation | WSJ0 Audio Spectrogram | Interpolation FID (0.0-0.8) | 15 | 10 |
| Generative Modeling | WSJ0 Audio Spectrogram | Log P(x) | 1.45 | 10 |
| reach velocity decoding (smoothing) | monkey reaching | R^2 | 87.5 | 7 |
| reach velocity decoding (prediction) | monkey reaching | R^2 | -2.4 | 7 |
| angular velocity decoding (smoothing) | Pendulum | R-squared | 98.4 | 6 |
| x-y position decoding (smoothing) | bouncing ball | R^2 Score | 0.765 | 6 |
| angular velocity decoding (prediction) | Pendulum | R^2 | -0.397 | 6 |
| x-y position decoding (prediction) | bouncing ball | R^2 Score | -0.233 | 6 |