Parallel Multiscale Autoregressive Density Estimation

About

PixelCNN achieves state-of-the-art results in density estimation for natural images. Although training is fast, inference is costly, requiring one network evaluation per pixel, i.e. O(N) evaluations for N pixels. This can be sped up by caching activations, but still involves generating each pixel sequentially. In this work, we propose a parallelized PixelCNN that allows more efficient inference by modeling certain pixel groups as conditionally independent. Our new PixelCNN model achieves competitive density estimation and orders of magnitude speedup (O(log N) sampling instead of O(N)), enabling the practical generation of 512x512 images. We evaluate the model on class-conditional image generation, text-to-image synthesis, and action-conditional video generation, showing that our model achieves the best results among non-pixel-autoregressive density models that allow efficient sampling.
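To make the O(N) versus O(log N) claim concrete, the following minimal sketch counts sequential sampling steps under simplifying assumptions. It is not the authors' implementation: the function names, the 4x4 base resolution, and the figure of 3 parallel group steps per resolution doubling are all illustrative assumptions chosen for this example.

```python
def sequential_steps_autoregressive(height: int, width: int) -> int:
    """Fully sequential sampling: one network evaluation per pixel, O(N)."""
    return height * width


def sequential_steps_multiscale(size: int, base_size: int = 4,
                                steps_per_doubling: int = 3) -> int:
    """Sequential steps for a square image grown by repeated doubling.

    Assumptions (hypothetical, for illustration only):
    - a base_size x base_size image is sampled pixel by pixel;
    - each doubling of resolution costs a constant number of steps,
      because the pixels within a group are sampled in parallel.
    """
    if size < base_size or size % base_size != 0:
        raise ValueError("size must be a multiple of base_size")
    ratio = size // base_size
    if ratio & (ratio - 1) != 0:
        raise ValueError("size / base_size must be a power of two")
    doublings = ratio.bit_length() - 1      # log2(size / base_size)
    base_steps = base_size * base_size      # small constant cost
    return base_steps + doublings * steps_per_doubling


if __name__ == "__main__":
    print(sequential_steps_autoregressive(512, 512))
    print(sequential_steps_multiscale(512))
```

Under these assumptions the number of doublings grows with the logarithm of the image side, so the sequential step count is O(log N) up to the constant cost of the base image, whereas the fully sequential count grows linearly with the number of pixels.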

Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Dan Belov, Nando de Freitas • 2017

Related benchmarks

Task | Dataset | Metric | Result | Rank
Density Estimation | ImageNet 32x32 (test) | Bits per Sub-pixel | 3.95 | 66
Density Estimation | ImageNet 64x64 (test) | Bits Per Sub-Pixel | 3.7 | 62
Unconditional Image Generation | ImageNet-32 | BPD | 3.95 | 31
Generative Modeling | ImageNet 32x32 downsampled | Bits Per Dimension | 3.95 | 24
Unconditional Image Generation | ImageNet 64 | BPD | 3.7 | 22
Unconditional image modeling | ImageNet 64x64 | Bits/Dim | 3.7 | 17
Density Estimation | ImageNet 64 | Bits-per-dimension | 3.7 | 16
Density Estimation | ImageNet 64x64 (val) | Bits/dim | 3.7 | 13
Sampling | ImageNet 32x32 | Sampling Time (s) | 1.17 | 9
Unconditional image modeling | ImageNet 32x32 | Bits/Dim | 3.95 | 8
Showing 10 of 19 rows
