Adversarial Diffusion Distillation

About

We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps while maintaining high image quality. We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal, in combination with an adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps. Our analyses show that our model clearly outperforms existing few-step methods (GANs, Latent Consistency Models) in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps. ADD is the first method to unlock single-step, real-time image synthesis with foundation models. Code and weights are available at https://github.com/Stability-AI/generative-models and https://huggingface.co/stabilityai/.
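
The combination of a teacher (distillation) signal with an adversarial loss described above can be illustrated with a toy objective. This is only a minimal sketch, not the authors' implementation: the abstract does not state the form of either loss or how they are weighted, so the hinge-style generator term, the squared-error distillation term, the weight `lam`, and all function names here are assumptions made for illustration.

```python
# Toy sketch of a combined "adversarial + distillation" student objective.
# All names and loss forms are hypothetical; the abstract does not specify them.

def hinge_generator_loss(disc_scores_fake):
    """Generator side of a hinge GAN loss (assumed form).

    Lower when the discriminator assigns higher scores to student samples.
    """
    return -sum(disc_scores_fake) / len(disc_scores_fake)


def distillation_loss(student_pred, teacher_pred):
    """Mean squared error between student output and teacher target (assumed form)."""
    squared = [(s - t) ** 2 for s, t in zip(student_pred, teacher_pred)]
    return sum(squared) / len(squared)


def combined_objective(disc_scores_fake, student_pred, teacher_pred, lam=1.0):
    """Adversarial term plus a weighted distillation term.

    `lam` is a hypothetical weighting factor, not a value taken from the source.
    """
    adv = hinge_generator_loss(disc_scores_fake)
    dist = distillation_loss(student_pred, teacher_pred)
    return adv + lam * dist


if __name__ == "__main__":
    loss = combined_objective(
        disc_scores_fake=[1.0, 3.0],
        student_pred=[0.0, 2.0],
        teacher_pred=[0.0, 0.0],
        lam=0.5,
    )
    print(loss)
```

In a real training setup the scalar lists would be tensors produced by a discriminator, the student network, and the frozen teacher diffusion model; the sketch only shows how the two signals add up into one objective.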

Axel Sauer, Dominik Lorenz, Andreas Blattmann, Robin Rombach • 2023

Related benchmarks

Task	Dataset	Result
Text-to-Image Generation	GenEval	Overall Score: 55	914
Text-to-Image Generation	GenEval	Overall Score: 55	581
Text-to-Image Generation	GenEval	GenEval Score: 68.77	459
Text-to-Image Generation	GenEval (test)	--	250
Text-to-Image Generation	MJHQ-30K	Overall FID: 24.77	239
Text-to-Image Generation	MS-COCO	FID: 19.4	193
Text-to-Image Generation	DPG-Bench	DPG Score: 70.94	156
Text-to-Image Generation	HPS v2.1	Overall Score: 28.66	153
Text-to-Image Generation	GenEval 1.0 (test)	Overall Score: 47.66	130
Text-to-Image Generation	T2I-CompBench++	Color: 0.5802	99

Showing 10 of 74 rows
