Adversarial Diffusion Distillation

About

We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps while maintaining high image quality. We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal in combination with an adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps. Our analyses show that our model clearly outperforms existing few-step methods (GANs, Latent Consistency Models) in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps. ADD is the first method to unlock single-step, real-time image synthesis with foundation models. Code and weights are available at https://github.com/Stability-AI/generative-models and https://huggingface.co/stabilityai/.
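
The abstract describes a training signal that combines an adversarial loss with a distillation loss from a teacher diffusion model. A minimal sketch of such a weighted combination might look like the following. The function names, the hinge form of the adversarial term, the squared-error form of the distillation term, and the weight `lam` are all assumptions made for illustration; they are not taken from the released code.

```python
# Illustrative sketch only: a weighted sum of an adversarial term and a
# distillation term. All names and loss forms here are hypothetical.

def hinge_generator_loss(disc_scores):
    """Generator-side hinge loss: lower when the discriminator scores
    the generated samples higher."""
    return -sum(disc_scores) / len(disc_scores)


def distillation_loss(student_pred, teacher_pred):
    """Mean squared error between the student's output and the teacher's
    prediction, both given as flat lists of floats."""
    n = len(student_pred)
    return sum((s - t) ** 2 for s, t in zip(student_pred, teacher_pred)) / n


def combined_loss(disc_scores, student_pred, teacher_pred, lam=1.0):
    """Adversarial term plus lam times the distillation term.
    lam is a placeholder weight, not a value from the source."""
    adv = hinge_generator_loss(disc_scores)
    distill = distillation_loss(student_pred, teacher_pred)
    return adv + lam * distill
```

In this sketch, raising `lam` pulls the student toward the teacher's predictions, while lowering it lets the adversarial term dominate.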

Axel Sauer, Dominik Lorenz, Andreas Blattmann, Robin Rombach • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Text-to-Image Generation | GenEval | Overall Score: 55 | 467 |
| Text-to-Image Generation | GenEval | GenEval Score: 54 | 277 |
| Text-to-Image Generation | T2I-CompBench (test) | Color Accuracy: 61.49 | 67 |
| Text-to-Image Generation | GenEval 1.0 (test) | Overall Score: 47.66 | 63 |
| Text-to-Image Generation | MS COCO zero-shot | FID: 16.25 | 42 |
| Text-to-Image Generation | HPSv2 | HPSv2 Score: 29.93 | 35 |
| Text-to-Image Generation | OneIG-Bench | Alignment: 0.791 | 33 |
| Text-to-Image Generation | COCO 30k | FID: 23.19 | 29 |
| Text-to-Image Generation | COCO 2014 (val) | Precision: 65 | 25 |
| Text-to-Image Generation | MS-COCO 10K prompts 2014 (val) | FID: 26.7 | 19 |
Showing 10 of 17 rows
