
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

About

Diffusion probabilistic models (DPMs) have achieved impressive success in high-resolution image synthesis, especially in recent large-scale text-to-image generation applications. An essential technique for improving the sample quality of DPMs is guided sampling, which usually needs a large guidance scale to obtain the best sample quality. The commonly used fast sampler for guided sampling is DDIM, a first-order diffusion ODE solver that generally needs 100 to 250 steps for high-quality samples. Although recent works propose dedicated high-order solvers and achieve a further speedup for sampling without guidance, their effectiveness for guided sampling has not been well tested. In this work, we demonstrate that previous high-order fast samplers suffer from instability issues, and that they even become slower than DDIM when the guidance scale grows large. To further speed up guided sampling, we propose DPM-Solver++, a high-order solver for the guided sampling of DPMs. DPM-Solver++ solves the diffusion ODE with the data prediction model and adopts thresholding methods to keep the solution consistent with the training data distribution. We further propose a multistep variant of DPM-Solver++ to address the instability issue by reducing the effective step size. Experiments show that DPM-Solver++ can generate high-quality samples within only 15 to 20 steps for guided sampling by pixel-space and latent-space DPMs.
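To make the ingredients named in the abstract concrete, here is a minimal NumPy sketch of a data-prediction update with thresholding. It is not the authors' implementation. It assumes the usual forward process x_t = alpha_t * x0 + sigma_t * eps, a dynamic-thresholding rule (clip to a percentile of |x0| and rescale), and only the first-order update; the function names and the default percentile are illustrative choices, and the multistep variant is not shown.

```python
import numpy as np

def eps_to_x0(x_t, eps, alpha_t, sigma_t):
    """Turn a noise prediction into a data prediction,
    assuming x_t = alpha_t * x0 + sigma_t * eps."""
    return (x_t - sigma_t * eps) / alpha_t

def dynamic_threshold(x0, percentile=99.5):
    """Clip x0 to [-s, s] and divide by s, where s is the given
    percentile of |x0|, floored at 1 (illustrative default)."""
    s = max(np.percentile(np.abs(x0), percentile), 1.0)
    return np.clip(x0, -s, s) / s

def first_order_step(x_s, x0_pred, alpha_s, sigma_s, alpha_t, sigma_t):
    """One first-order data-prediction update from time s to time t.
    h is the step in log-SNR space, lambda = log(alpha / sigma)."""
    h = np.log(alpha_t / sigma_t) - np.log(alpha_s / sigma_s)
    return (sigma_t / sigma_s) * x_s - alpha_t * np.expm1(-h) * x0_pred
```

If the data prediction is exact, one such step lands exactly on alpha_t * x0 + sigma_t * eps, which is a convenient sanity check for the update. In a guided sampler, the data prediction would come from the guided model output and be thresholded before being passed to the step.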

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Generation | ImageNet 256x256 | FID 4.59 | 243 |
| Unconditional Image Generation | CIFAR-10 | FID 3.88 | 171 |
| Text-to-Image Generation | MS-COCO 2014 (val) | FID 15.72 | 128 |
| Image Generation | ImageNet 64x64 | FID 2.7 | 114 |
| Image Generation | CIFAR-10 | FID 2.91 | 95 |
| Unconditional Image Generation | CIFAR-10 32x32 (test) | FID 3.42 | 94 |
| Image Generation | CIFAR10 50k samples (test) | FID 2.02 | 81 |
| Text-to-Image Generation | MS-COCO 2017 (val) | FID 20.51 | 80 |
| Conditional Image Generation | CIFAR-10 | FID 3.61 | 71 |
| Image Generation | ImageNet 512 | FID 3.6 | 57 |
Showing 10 of 42 rows
