PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
About
PixelCNNs are a recently proposed class of powerful generative models with tractable likelihood. Here we discuss our implementation of PixelCNNs, which we make available at https://github.com/openai/pixel-cnn. Our implementation contains a number of modifications to the original model that both simplify its structure and improve its performance:

1. We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find speeds up training.
2. We condition on whole pixels, rather than R/G/B sub-pixels, simplifying the model structure.
3. We use downsampling to efficiently capture structure at multiple resolutions.
4. We introduce additional short-cut connections to further speed up optimization.
5. We regularize the model using dropout.

Finally, we present state-of-the-art log-likelihood results on CIFAR-10 to demonstrate the usefulness of these modifications.
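To illustrate the first modification, here is a minimal NumPy sketch of the discretized logistic mixture log-likelihood for a single sub-pixel value. It assumes the paper's usual convention of pixels rescaled to [-1, 1] with 256 discrete levels, so each level covers an interval of width 2/255; the function name and parameter shapes are illustrative, not taken from the released code.

```python
import numpy as np

def discretized_logistic_logprob(x, means, log_scales, logit_probs):
    """Log-likelihood of a pixel value under a K-component discretized
    logistic mixture.

    x           : scalar pixel value rescaled to [-1, 1]
    means       : shape [K], component means
    log_scales  : shape [K], log of component scales
    logit_probs : shape [K], unnormalized mixture weights
    """
    inv_s = np.exp(-log_scales)
    # CDF of the logistic at the upper and lower edges of x's bin.
    cdf_plus = 1.0 / (1.0 + np.exp(-inv_s * (x - means + 1.0 / 255.0)))
    cdf_min = 1.0 / (1.0 + np.exp(-inv_s * (x - means - 1.0 / 255.0)))
    if x < -0.999:
        # Leftmost level absorbs all mass below it.
        probs = cdf_plus
    elif x > 0.999:
        # Rightmost level absorbs all mass above it.
        probs = 1.0 - cdf_min
    else:
        # Interior level: probability mass of its bin.
        probs = cdf_plus - cdf_min
    # Mix components with softmax(logit_probs) weights.
    log_weights = logit_probs - np.log(np.sum(np.exp(logit_probs)))
    return np.log(np.sum(np.exp(log_weights) * probs) + 1e-12)
```

Because interior bins telescope and the edge bins absorb the tails, the probabilities over all 256 levels sum to one for any parameter setting, which is what makes this a proper discrete likelihood rather than a continuous density approximation.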
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Generation | CIFAR-10 (test) | -- | -- | 471 |
| Density Estimation | CIFAR-10 (test) | Bits/dim | 2.92 | 134 |
| Out-of-Distribution Detection | CIFAR-10 | AUROC | 100 | 105 |
| Out-of-Distribution Detection | CIFAR-10 (ID) vs SVHN (OOD) (test) | AUROC | 15.8 | 79 |
| Density Estimation | ImageNet 32x32 (test) | Bits per Sub-pixel | 3.77 | 66 |
| Generative Modeling | CIFAR-10 (test) | NLL (bits/dim) | 2.92 | 62 |
| Generative Modeling | CIFAR-10 | BPD | 2.92 | 46 |
| Out-of-Distribution Detection | CIFAR-10 vs CIFAR-100 | AUROC | 52.4 | 41 |
| Density Estimation | CIFAR-10 | bpd | 2.92 | 40 |
| Image Modeling | CIFAR-10 (test) | NLL (bits/dim) | 2.92 | 36 |