
PixelSNAIL: An Improved Autoregressive Generative Model

About

Autoregressive generative models consistently achieve the best results in density estimation tasks involving high-dimensional data, such as images or audio. They pose density estimation as a sequence modeling task, in which a recurrent neural network (RNN) models the conditional distribution over the next element given all previous elements. In this paradigm, the bottleneck is the extent to which the RNN can model long-range dependencies, and the most successful approaches rely on causal convolutions, which offer better access to earlier parts of the sequence than conventional RNNs. Taking inspiration from recent work in meta-reinforcement learning, where dealing with long-range dependencies is also essential, we introduce a new generative model architecture that combines causal convolutions with self-attention. In this note, we describe the resulting model and present state-of-the-art log-likelihood results on CIFAR-10 (2.85 bits per dim) and $32 \times 32$ ImageNet (3.80 bits per dim). Our implementation is available at https://github.com/neocxi/pixelsnail-public
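To make the two ingredients concrete, here is a minimal NumPy sketch of a causal 1-D convolution and a causally masked self-attention step. This is a toy illustration of the general technique, not the PixelSNAIL implementation (which operates on 2-D images with gated convolutions); all function names here are our own.

```python
import numpy as np

def causal_conv1d(x, w):
    """Causal 1-D convolution: the output at position t depends only on
    inputs at positions <= t. x: (T,) sequence, w: (k,) kernel."""
    k = len(w)
    x_pad = np.concatenate([np.zeros(k - 1), x])  # left-pad so no future element leaks in
    return np.array([np.dot(x_pad[t:t + k], w) for t in range(len(x))])

def causal_self_attention(x):
    """Single-head self-attention with a causal (lower-triangular) mask.
    x: (T, d) sequence of feature vectors; x itself serves as Q, K, and V."""
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)                # (T, T) pairwise similarity scores
    mask = np.tril(np.ones((T, T), dtype=bool))  # position t may attend only to positions <= t
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x                           # (T, d) attended features
```

The convolution gives each position cheap access to a fixed-size local window, while the masked attention lets it query the entire history at once; PixelSNAIL's core idea is to interleave blocks of both.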

Xi Chen, Nikhil Mishra, Mostafa Rohaninejad, Pieter Abbeel • 2017

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Generation | CIFAR-10 (test) | - | - | 471 |
| Density Estimation | CIFAR-10 (test) | Bits/dim | 2.85 | 134 |
| Generative Modeling | CIFAR-10 (test) | NLL (bits/dim) | 2.85 | 62 |
| Density Estimation | ImageNet 64x64 (test) | Bits Per Sub-Pixel | 3.52 | 62 |
| Generative Modeling | CIFAR-10 | BPD | 2.85 | 46 |
| Density Estimation | CIFAR-10 | bpd | 2.85 | 40 |
| Image Modeling | CIFAR-10 (test) | NLL (bits/dim) | 2.85 | 36 |
| Unconditional Image Generation | CIFAR10 | BPD | 2.85 | 33 |
| Unconditional Image Generation | ImageNet-32 | BPD | 3.8 | 31 |
| Generative Modeling | ImageNet 32x32 downsampled | Bits Per Dimension | 3.8 | 24 |

Showing 10 of 21 rows
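The metric names in the table (Bits/dim, BPD, NLL (bits/dim), Bits Per Sub-Pixel) all denote the same quantity: the model's negative log-likelihood, expressed in base-2 and averaged over the dimensions (sub-pixels) of an image. A small sketch of the conversion, with a helper name of our own choosing:

```python
import math

def bits_per_dim(nll_nats, num_dims):
    """Convert a total negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))

# A 32x32 RGB image has 32 * 32 * 3 = 3072 sub-pixel dimensions, so a
# CIFAR-10 model at 2.85 bits/dim assigns about 2.85 * 3072 = 8755.2 bits
# (roughly 1.07 KB) of code length per image.
total_bits = 2.85 * 32 * 32 * 3
```

Lower is better: a uniform model over 8-bit sub-pixels would score exactly 8 bits/dim, so 2.85 reflects substantial compression of the data distribution.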
