DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation

About

Diffusion probabilistic models (DPMs) have shown remarkable performance in visual synthesis but are computationally expensive due to the need for multiple function evaluations during sampling. Recent predictor-corrector diffusion samplers have significantly reduced the required number of function evaluations (NFE), but inherently suffer from a misalignment issue caused by the extra corrector step, especially with a large classifier-free guidance (CFG) scale. In this paper, we introduce a new fast DPM sampler called DC-Solver, which leverages dynamic compensation (DC) to mitigate the misalignment of predictor-corrector samplers. The dynamic compensation is controlled by compensation ratios that are adaptive to the sampling steps and can be optimized on only 10 datapoints by pushing the sampling trajectory toward a ground-truth trajectory. We further propose a cascade polynomial regression (CPR) that can instantly predict the compensation ratios for unseen sampling configurations. Additionally, we find that the proposed dynamic compensation can also serve as a plug-and-play module to boost the performance of predictor-only samplers. Extensive experiments on both unconditional and conditional sampling demonstrate that DC-Solver consistently improves sampling quality over previous methods on different DPMs across a wide range of resolutions up to 1024×1024. Notably, we achieve 10.38 FID (NFE=5) on unconditional FFHQ and 0.394 MSE (NFE=5, CFG=7.5) on Stable-Diffusion-2.1. Code is available at https://github.com/wl-zhao/DC-Solver
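To make the idea of a per-step compensation ratio more concrete, the following is a minimal toy sketch in Python. It is not the authors' implementation. It assumes that compensation is done by Lagrange interpolation of buffered model outputs at a timestep blended by a ratio `rho`, and it replaces the paper's optimization with a plain grid search against a reference value. The helper names `lagrange_interpolate`, `compensate`, and `fit_ratio` are hypothetical and exist only for this illustration.

```python
import numpy as np


def lagrange_interpolate(ts, values, t_query):
    """Evaluate the Lagrange polynomial through (ts[k], values[k]) at t_query."""
    ts = np.asarray(ts, dtype=float)
    result = np.zeros_like(np.asarray(values[0], dtype=float))
    for k in range(len(ts)):
        weight = 1.0
        for j in range(len(ts)):
            if j != k:
                weight *= (t_query - ts[j]) / (ts[k] - ts[j])
        result = result + weight * np.asarray(values[k], dtype=float)
    return result


def compensate(ts, outputs, rho):
    """Assumed compensation rule: blend the two latest timesteps by rho and
    interpolate the buffered outputs at the blended time.
    With rho = 1 the newest buffered output is returned unchanged."""
    t_query = rho * ts[-1] + (1.0 - rho) * ts[-2]
    return lagrange_interpolate(ts, outputs, t_query)


def fit_ratio(ts, outputs, target, candidates):
    """Grid-search stand-in for ratio optimization: pick the candidate rho
    whose compensated output has the smallest mean squared error to target."""
    errors = [np.mean((compensate(ts, outputs, r) - target) ** 2) for r in candidates]
    return float(candidates[int(np.argmin(errors))])


# Toy usage: buffered "model outputs" sampled from f(t) = t^2.
ts = [0.0, 1.0, 2.0]
outputs = [np.array([t ** 2]) for t in ts]
reference = np.array([2.5 ** 2])  # stand-in for a ground-truth trajectory value
best_rho = fit_ratio(ts, outputs, reference, np.linspace(0.5, 2.0, 16))
```

In this toy setting one ratio is fitted for a single step. The abstract describes ratios that adapt across sampling steps, so a fuller sketch would repeat the fit once per step along the trajectory.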

Wenliang Zhao, Haolin Wang, Jie Zhou, Jiwen Lu • 2024

Related benchmarks

Task	Dataset	Result
Conditional Image Generation	ImageNet 256x256 Guided-Diffusion (10k samples)	FID 7.46	128
Text-to-Image Generation	Stable Diffusion 10k samples v1.4	CLIP Similarity 99.41	119
Text-to-Image Generation	Stable Diffusion 10k samples v1.4 (test)	RMSE Loss 0.0356	44
Unconditional Image Generation	CIFAR-10 32x32 EDM (test)	FID 2.07	44
Unconditional Image Generation	CIFAR-10 32x32 Score-SDE	FID (NFE=5) 48.27	4
