Classifier-Free Diffusion Guidance
About
Classifier guidance is a recently introduced method to trade off mode coverage and sample fidelity in conditional diffusion models after training, in the same spirit as low temperature sampling or truncation in other types of generative models. Classifier guidance combines the score estimate of a diffusion model with the gradient of an image classifier and thereby requires training an image classifier separate from the diffusion model. It also raises the question of whether guidance can be performed without a classifier. We show that guidance can indeed be performed by a pure generative model without such a classifier: in what we call classifier-free guidance, we jointly train a conditional and an unconditional diffusion model, and we combine the resulting conditional and unconditional score estimates to attain a trade-off between sample quality and diversity similar to that obtained using classifier guidance.
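
The combination of conditional and unconditional estimates described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `guided_estimate`, the guidance weight `w`, and the plain-list inputs are hypothetical, and the two input vectors stand in for the outputs of a jointly trained model evaluated with and without the conditioning signal. The weighting uses the linear form (1 + w) · conditional − w · unconditional, so `w = 0` recovers the conditional estimate and larger `w` pushes samples toward fidelity at the cost of diversity.

```python
from typing import List, Sequence


def guided_estimate(
    cond: Sequence[float], uncond: Sequence[float], w: float
) -> List[float]:
    """Linearly combine conditional and unconditional estimates.

    Assumed form: (1 + w) * cond - w * uncond.
    w = 0 returns the conditional estimate unchanged.
    """
    if len(cond) != len(uncond):
        raise ValueError("estimates must have the same length")
    return [(1.0 + w) * c - w * u for c, u in zip(cond, uncond)]


# Toy stand-ins for the two model outputs at one sampling step.
cond_out = [1.0, 2.0]
uncond_out = [0.5, 1.0]

no_guidance = guided_estimate(cond_out, uncond_out, w=0.0)
with_guidance = guided_estimate(cond_out, uncond_out, w=1.0)
```

With `w = 1.0` the result moves further along the direction from the unconditional estimate to the conditional one, which is the mechanism behind the quality/diversity trade-off.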
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Text-to-Image Generation | GenEval | Overall Score 58.36 | 467 |
| Class-conditional Image Generation | ImageNet 256x256 (val) | FID 2.3 | 293 |
| Image Generation | ImageNet 64x64 | FID 16.8 | 114 |
| Conditional Image Generation | ImageNet-1K 256x256 (val) | -- | 86 |
| Class-conditional Image Generation | ImageNet 512x512 (val) | FID (Val) 3.08 | 69 |
| Class-conditional image synthesis | ImageNet 256x256 (val) | FID 1.89 | 61 |
| Image Generation | ImageNet 64x64 (val) | FID 16.8 | 48 |
| Text-to-Image Generation | Pick-a-Pic | PickScore 22.34 | 47 |
| Text-to-Image Generation | DrawBench | Pick Score 23.13 | 40 |
| Class-conditional Image Generation | ImageNet 128x128 | FID 2.43 | 27 |