
Variational Lossy Autoencoder

About

Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining a Variational Autoencoder (VAE) with neural autoregressive models such as RNNs, MADE, and PixelRNN/CNN. Our proposed VAE model allows us to control what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, so that the VAE only "autoencodes" data in a lossy fashion. In addition, by leveraging autoregressive models as both the prior distribution $p(z)$ and the decoding distribution $p(x|z)$, we can greatly improve the generative modeling performance of VAEs, achieving new state-of-the-art results on the MNIST, OMNIGLOT, and Caltech-101 Silhouettes density estimation tasks.

Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel • 2016
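
To make the decoder-side mechanism concrete, below is a minimal PyTorch sketch of the idea under stated assumptions: the class and variable names (MaskedConv2d, LossyVAE) are ours, not the authors', and the paper's actual models use PixelCNN/PixelRNN decoders and autoregressive flow priors rather than this toy single-layer version. The decoder here is autoregressive over pixels, but its receptive field is restricted to one 3x3 masked convolution, so it can only explain local texture; the latent code z, injected at every pixel, must carry the global structure.

```python
# Minimal sketch of the "lossy" VAE idea in PyTorch. Hypothetical names
# (MaskedConv2d, LossyVAE); not the authors' implementation. The decoder is
# autoregressive over pixels but its receptive field is a single 3x3 masked
# conv, so it can only capture local texture -- global structure has to go
# through the latent code z.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedConv2d(nn.Conv2d):
    """Type-'A' masked convolution: pixel i sees only pixels generated before it."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        kH, kW = self.kernel_size
        mask = torch.ones(1, 1, kH, kW)
        mask[:, :, kH // 2, kW // 2:] = 0  # block current pixel and rest of its row
        mask[:, :, kH // 2 + 1:, :] = 0    # block all later rows
        self.register_buffer("mask", mask)

    def forward(self, x):
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding)


class LossyVAE(nn.Module):
    def __init__(self, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 512), nn.ReLU(),
                                 nn.Linear(512, 2 * z_dim))  # -> (mu, logvar)
        self.z_to_ctx = nn.Linear(z_dim, 784)        # broadcast z as a global context map
        self.ar = MaskedConv2d(1, 32, 3, padding=1)  # sees only x_<i (local texture)
        self.ctx = nn.Conv2d(1, 32, 1)               # z context, visible at every pixel
        self.out = nn.Conv2d(32, 1, 1)               # Bernoulli logits for p(x_i | x_<i, z)

    def forward(self, x):  # x: (B, 1, 28, 28), binarized
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        ctx = self.z_to_ctx(z).view(-1, 1, 28, 28)
        logits = self.out(F.relu(self.ar(x) + self.ctx(ctx)))  # teacher-forced AR decoder
        rec = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return (rec + kl) / x.size(0)  # negative ELBO in nats per image
```

Because the reconstruction term lets the masked convolution model nearby pixels "for free," the KL term pressures z to encode only what local context cannot predict, which is exactly the lossy-compression behavior described in the abstract.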

Related benchmarks

Task                      | Dataset                                | Metric         | Result | Rank
Density Estimation        | CIFAR-10 (test)                        | Bits/dim       | 2.95   | 134
Generative Modeling       | CIFAR-10 (test)                        | NLL (bits/dim) | 2.95   | 62
Log-likelihood estimation | MNIST dynamically binarized (test)     | Log-Likelihood | 78.53  | 48
Generative Modeling       | CIFAR-10                               | BPD            | 2.95   | 46
Density Estimation        | binarized MNIST 28x28 (test)           | Test LogL      | 79.03  | 44
Image Modeling            | Omniglot (test)                        | NLL            | 89.83  | 27
Density Estimation        | OMNIGLOT dynamically binarized (test)  | NLL            | 89.83  | 16
Generative Modeling       | MNIST                                  | --             | --     | 10
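Note on units: the three CIFAR-10 rows are the same result (2.95 bits per dimension) listed under different leaderboard names, while the MNIST and OMNIGLOT rows are negative log-likelihoods in nats, as is conventional for those benchmarks. The conversion is $\text{bits/dim} = \text{NLL}_{\text{nats}} / (D \ln 2)$ for $D$-dimensional data; for example, $78.53$ nats on $28 \times 28$ MNIST corresponds to $78.53 / (784 \ln 2) \approx 0.145$ bits/dim.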
