
MADE: Masked Autoencoder for Distribution Estimation

About

There has been a lot of recent interest in designing neural network models to estimate a distribution from a set of examples. We introduce a simple modification for autoencoder neural networks that yields powerful generative models. Our method masks the autoencoder's parameters to respect autoregressive constraints: each input is reconstructed only from previous inputs in a given ordering. Constrained this way, the autoencoder outputs can be interpreted as a set of conditional probabilities, and their product, the full joint probability. We can also train a single network that can decompose the joint probability in multiple different orderings. Our simple framework can be applied to multiple architectures, including deep ones. Vectorized implementations, such as on GPUs, are simple and fast. Experiments demonstrate that this approach is competitive with state-of-the-art tractable distribution estimators. At test time, the method is significantly faster and scales better than other autoregressive estimators.

Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle • 2015
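
The masking scheme lends itself to a short sketch. Below is a minimal NumPy illustration of autoregressive masks for a single hidden layer, the natural input ordering, and binary (Bernoulli) outputs; the variable names (m_in, m_hid, M_in, M_out) and the setup are our own, not the authors' implementation.

```python
# Minimal MADE-style masking sketch (one hidden layer, natural ordering).
# Assumption: binary data with Bernoulli conditionals per dimension.
import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 16                                 # input dimension, hidden units

# Degrees: inputs get 1..D; hidden units draw degrees from {1, ..., D-1}.
m_in = np.arange(1, D + 1)
m_hid = rng.integers(1, D, size=H)

# Input->hidden mask: hidden unit k may see input d iff m_hid[k] >= m_in[d].
# Hidden->output mask: output d may see unit k iff m_hid[k] < d (strict),
# so output d depends only on inputs earlier in the ordering.
M_in = (m_hid[:, None] >= m_in[None, :]).astype(float)   # shape (H, D)
M_out = (m_in[:, None] > m_hid[None, :]).astype(float)   # shape (D, H)

# Random weights; the masks are applied elementwise at every forward pass.
W1 = rng.normal(scale=0.1, size=(H, D)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=(D, H)); b2 = np.zeros(D)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def made_forward(x):
    """Return p(x_d = 1 | x_<d) for every dimension in one vectorized pass."""
    h = sigmoid((W1 * M_in) @ x + b1)
    return sigmoid((W2 * M_out) @ h + b2)

def nll(x):
    """Joint negative log-likelihood: -sum_d log p(x_d | x_<d), in nats."""
    p = made_forward(x)
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

x = rng.integers(0, 2, size=D).astype(float)
print(nll(x))
```

Because every conditional comes out of a single masked forward pass, evaluating the joint likelihood costs one network evaluation, which is the source of the test-time speed claim above. Training over multiple orderings, as the abstract describes, would amount to resampling m_in as a random permutation of 1..D (and redrawing the hidden degrees) between updates while sharing the weights.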

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Density Estimation | CIFAR-10 (test) | Bits/dim | 5.67 | 134 |
| Density Estimation | MNIST (test) | NLL (bits/dim) | 2.04 | 56 |
| Density Estimation | Binarized MNIST 28x28 (test) | -- | -- | 44 |
| Unconditional Density Estimation | POWER (test) | Average Test Log-Likelihood (nats) | 0.4 | 30 |
| Density Estimation | GAS d=8; N=1,052,065 (test) | Avg Test Log-Likelihood | 8.47 | 25 |
| Density Estimation | BSDS300 (test) | NLL (nats) | -148.8 | 25 |
| Unconditional Density Estimation | HEPMASS (test) | NLL (nats) | 20.98 | 22 |
| Unconditional Density Estimation | MINIBOONE (test) | NLL (nats) | 15.59 | 22 |
| Density Estimation | OCR-letters (test) | -- | -- | 19 |
| Generative Modeling | MNIST Binary (test) | NLL (nats) | 86.64 | 13 |

Showing 10 of 34 rows.

Other info

Code
