Multistep Consistency Models
About
Diffusion models are relatively easy to train but require many steps to generate samples. Consistency models are far more difficult to train, but generate samples in a single step. In this paper we propose Multistep Consistency Models: a unification of Consistency Models (Song et al., 2023) and TRACT (Berthelot et al., 2023) that can interpolate between a consistency model and a diffusion model, trading off sampling speed against sampling quality. Specifically, a 1-step consistency model is a conventional consistency model, whereas an $\infty$-step consistency model is a diffusion model. Multistep Consistency Models work well in practice. By increasing the sample budget from a single step to 2-8 steps, we can more easily train models that generate higher-quality samples, while retaining much of the sampling-speed benefit. Notable results are 1.4 FID on ImageNet 64 in 8 steps and 2.1 FID on ImageNet 128 in 8 steps with consistency distillation, using simple losses without adversarial training. We also show that our method scales to a text-to-image diffusion model, generating samples that are close in quality to those of the original model.
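The interpolation described above can be illustrated with a minimal sampling sketch. This is a hypothetical simplification, not the paper's exact sampler: it assumes a trained consistency-style model `f(x_t, t)` that maps a noisy sample at time `t` directly to a clean-data estimate, and uses a simple linear re-noising rule between segment boundaries. With `num_steps=1` it reduces to a single consistency step; as `num_steps` grows, sampling approaches a many-step diffusion sampler.

```python
import numpy as np

def multistep_consistency_sample(f, num_steps, shape, rng=None):
    """Hypothetical sketch of multistep consistency sampling.

    The time interval [0, 1] is split into `num_steps` segments. At each
    segment boundary the model predicts clean data directly; the estimate
    is then re-noised to the next boundary (simplified linear schedule,
    an assumption for illustration).
    """
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(shape)            # start from pure noise at t = 1
    ts = np.linspace(1.0, 0.0, num_steps + 1) # segment boundaries 1 -> 0
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x0_hat = f(x, t_cur)                  # direct clean-data prediction
        if t_next > 0:
            # mix the estimate with fresh noise to land at the next boundary
            noise = rng.standard_normal(shape)
            x = (1.0 - t_next) * x0_hat + t_next * noise
        else:
            x = x0_hat                        # final step returns the estimate
    return x
```

The single loop makes the trade-off explicit: each extra step is one more network evaluation, bought in exchange for a shorter (and easier to learn) jump per segment.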
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Generation | ImageNet 64x64 resolution (test) | FID 1.9 | 150 |
| Class-conditional Image Generation | ImageNet 64x64 | FID 1.4 | 126 |
| Class-conditional Image Generation | ImageNet 64x64 (test) | FID 1.9 | 86 |
| Class-conditional Image Generation | ImageNet 128x128 | FID 2.1 | 27 |
| Class-conditional Image Generation | ImageNet 128x128 (test val) | FID 2.1 | 7 |