
Diffusion Models Beat GANs on Image Synthesis

About

We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for fidelity using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128×128, 4.59 on ImageNet 256×256, and 7.72 on ImageNet 512×512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.94 on ImageNet 256×256 and 3.85 on ImageNet 512×512. We release our code at https://github.com/openai/guided-diffusion

Prafulla Dhariwal, Alex Nichol • 2021
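
To make the classifier-guidance idea from the abstract concrete, here is a minimal Python sketch. It assumes the commonly described form of the update, in which the mean of each Gaussian denoising step is shifted by a guidance scale times the step's variance times the gradient of the classifier's log-probability for the target class. The toy linear-softmax classifier, the function names (`classifier_log_prob_grad`, `guided_mean`), and all numeric values are hypothetical and chosen only for illustration; this is not the code from the guided-diffusion repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_log_prob_grad(x, y, weights, biases):
    """Gradient of log p(y | x) w.r.t. x for a toy linear-softmax classifier.

    Assumption: logits_k = weights[k] . x + biases[k], which gives
    d/dx log p(y | x) = weights[y] - sum_k p_k * weights[k].
    A real classifier would be a neural network differentiated by autograd.
    """
    logits = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    return [
        weights[y][i] - sum(p * w[i] for p, w in zip(probs, weights))
        for i in range(len(x))
    ]


def guided_mean(mean, variance, x, y, weights, biases, scale=1.0):
    """Shift the denoising mean toward class y.

    Assumed update: mean + scale * variance * grad_x log p(y | x),
    with a diagonal variance given as a per-coordinate list.
    """
    grad = classifier_log_prob_grad(x, y, weights, biases)
    return [m + scale * v * g for m, v, g in zip(mean, variance, grad)]


if __name__ == "__main__":
    # Hypothetical 2-D "image" and 3-class classifier.
    weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    biases = [0.0, 0.0, 0.0]
    x_t = [0.0, 0.0]
    mean, variance = [0.0, 0.0], [0.5, 0.5]
    for scale in (0.0, 1.0, 2.0):
        print(scale, guided_mean(mean, variance, x_t, 0, weights, biases, scale))
```

With `scale` set to 0 the mean is left unchanged, and larger values push each sampling step harder toward the target class, which is the diversity-for-fidelity trade-off the abstract refers to.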

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Class-conditional Image Generation | ImageNet 256x256 | Inception Score (IS) 215.8 | 441 |
| Image Generation | ImageNet 256x256 (val) | FID 3.85 | 307 |
| Class-conditional Image Generation | ImageNet 256x256 (train) | IS 215.8 | 305 |
| Class-conditional Image Generation | ImageNet 256x256 (val) | FID 4.59 | 293 |
| Image Generation | ImageNet 256x256 | FID 3.94 | 243 |
| Image Generation | ImageNet (val) | FID 32.5 | 198 |
| Image Generation | ImageNet 512x512 (val) | FID-50K 3.85 | 184 |
| Class-conditional Image Generation | ImageNet 256x256 (train val) | FID 3.94 | 178 |
| Class-conditional Image Generation | ImageNet 256x256 (test) | FID 3.94 | 167 |
| Image Generation | ImageNet 64x64 resolution (test) | FID 2.07 | 150 |
Showing 10 of 122 rows