
Multistep Consistency Models

About

Diffusion models are relatively easy to train but require many steps to generate samples. Consistency models are far more difficult to train, but generate samples in a single step. In this paper we propose Multistep Consistency Models: a unification between Consistency Models (Song et al., 2023) and TRACT (Berthelot et al., 2023) that can interpolate between a consistency model and a diffusion model, trading off sampling speed against sampling quality. Specifically, a 1-step consistency model is a conventional consistency model, whereas an ∞-step consistency model is a diffusion model. Multistep Consistency Models work very well in practice. By increasing the sample budget from a single step to 2-8 steps, we can more easily train models that generate higher quality samples, while retaining much of the sampling-speed benefit. Notable results are 1.4 FID on ImageNet 64 in 8 steps and 2.1 FID on ImageNet 128 in 8 steps with consistency distillation, using simple losses without adversarial training. We also show that our method scales to a text-to-image diffusion model, generating samples that are close to the quality of the original model.
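The interpolation the abstract describes can be sketched as a sampling loop: split the noise schedule into a fixed budget of segments, and at each boundary use the consistency network to jump to a clean-sample estimate, then re-noise to the next boundary. Below is a minimal pure-Python sketch; `consistency_fn` and the exact re-noising schedule are assumptions for illustration, not the paper's implementation.

```python
import math
import random

def multistep_consistency_sample(consistency_fn, num_steps, dim, seed=0):
    """Sketch of multistep consistency sampling on a 1-D vector.

    consistency_fn(x, t) stands in for a trained network that maps a noisy
    sample at time t to an estimate of the clean sample x_0. The interval
    [0, 1] is split into num_steps segments; after each consistency jump we
    re-noise to the next boundary, so num_steps=1 recovers a plain
    consistency model and a large budget approaches diffusion sampling.
    """
    rng = random.Random(seed)
    ts = [1.0 - i / num_steps for i in range(num_steps + 1)]  # 1.0 -> 0.0
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    for t, t_next in zip(ts[:-1], ts[1:]):
        x0_est = consistency_fn(x, t)  # jump straight to a clean estimate
        if t_next > 0:
            # Re-noise the estimate to the next boundary (a simple
            # variance-preserving-style mix; this schedule is an assumption
            # of the sketch, not the paper's).
            a = math.sqrt(1.0 - t_next ** 2)
            x = [a * x0 + t_next * rng.gauss(0.0, 1.0) for x0 in x0_est]
        else:
            x = x0_est  # final jump lands on the sample
    return x
```

With `num_steps=1` the loop performs a single jump from noise to data, matching the conventional consistency model the abstract mentions; larger budgets spend extra network evaluations to refine the sample.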

Jonathan Heek, Emiel Hoogeboom, Tim Salimans • 2024

Related benchmarks

Task | Dataset | Result | Rank
Image Generation | ImageNet 64x64 resolution (test) | FID 1.9 | 150
Class-conditional Image Generation | ImageNet 64x64 | FID 1.4 | 126
Class-conditional Image Generation | ImageNet 64x64 (test) | FID 1.9 | 86
Class-conditional Image Generation | ImageNet 128x128 | FID 2.1 | 27
Class-conditional Image Generation | ImageNet 128x128 (test val) | FID 2.1 | 7
