Parallel Multiscale Autoregressive Density Estimation
About
PixelCNN achieves state-of-the-art results in density estimation for natural images. Although training is fast, inference is costly, requiring one network evaluation per pixel, i.e. O(N) for N pixels. This can be sped up by caching activations, but still involves generating each pixel sequentially. In this work, we propose a parallelized PixelCNN that allows more efficient inference by modeling certain pixel groups as conditionally independent. Our new PixelCNN model achieves competitive density estimation and an orders-of-magnitude speedup (O(log N) sampling instead of O(N)), enabling the practical generation of 512x512 images. We evaluate the model on class-conditional image generation, text-to-image synthesis, and action-conditional video generation, showing that our model achieves the best results among non-pixel-autoregressive density models that allow efficient sampling.
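The O(log N) sampling cost comes from the multiscale scheme: the image is generated coarse-to-fine, and at each resolution-doubling step the new pixels are split into groups that are modeled as conditionally independent given the coarser image, so each group needs only one network evaluation. A minimal sketch of the control flow (the `group_net` here is a hypothetical stand-in that returns random pixels, not the paper's actual network):

```python
import numpy as np

def sample_image(final_size=512, base_size=4, seed=None):
    rng = np.random.default_rng(seed)

    def group_net(conditioning, out_shape):
        # Stand-in for one forward pass of the conditional PixelCNN:
        # it predicts a whole pixel group at once. Here: random pixels.
        return rng.integers(0, 256, size=out_shape, dtype=np.uint8)

    # Generate a tiny base image with a single evaluation.
    img = group_net(None, (base_size, base_size))
    evaluations = 1

    while img.shape[0] < final_size:
        s = img.shape[0] * 2
        up = np.zeros((s, s), dtype=np.uint8)
        up[::2, ::2] = img  # keep the already-generated coarse pixels
        # The three new pixel groups at this scale are conditionally
        # independent given the coarser image, so each is sampled in
        # parallel with one network evaluation.
        for dy, dx in [(0, 1), (1, 0), (1, 1)]:
            up[dy::2, dx::2] = group_net(img, (s // 2, s // 2))
            evaluations += 1
        img = up

    return img, evaluations

img, n_evals = sample_image(512, 4)
# 512 = 4 * 2^7, so 7 doublings: 1 + 3*7 = 22 evaluations
# for 512*512 = 262,144 pixels, versus 262,144 for plain PixelCNN.
```

The group decomposition (which pixels go in which group, and in what order) is a modeling choice; the sketch above uses a simple even/odd checkerboard-style split purely to illustrate the logarithmic evaluation count.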
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Density Estimation | ImageNet 32x32 (test) | Bits per Sub-pixel | 3.95 | 66 |
| Density Estimation | ImageNet 64x64 (test) | Bits per Sub-pixel | 3.7 | 62 |
| Unconditional Image Generation | ImageNet-32 | BPD | 3.95 | 31 |
| Generative Modeling | ImageNet 32x32 downsampled | Bits per Dimension | 3.95 | 24 |
| Unconditional Image Generation | ImageNet 64 | BPD | 3.7 | 22 |
| Unconditional image modeling | ImageNet 64x64 | Bits/Dim | 3.7 | 17 |
| Density Estimation | ImageNet 64 | Bits per Dimension | 3.7 | 16 |
| Density Estimation | ImageNet 64x64 (val) | Bits/Dim | 3.7 | 13 |
| Sampling | ImageNet 32x32 | Sampling Time (s) | 1.17 | 9 |
| Unconditional image modeling | ImageNet 32x32 | Bits/Dim | 3.95 | 8 |