
Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models

About

Diffusion models (DMs) have achieved state-of-the-art generative performance but suffer from high sampling latency due to their sequential denoising nature. Existing solver-based acceleration methods often suffer image-quality degradation under a low-latency budget. In this paper, we propose the Ensemble Parallel Direction solver (dubbed EPD), a novel ODE solver that mitigates truncation errors by incorporating multiple parallel gradient evaluations in each ODE step. Importantly, since the additional gradient computations are independent, they can be fully parallelized, preserving low-latency sampling. Our method optimizes a small set of learnable parameters in a distillation fashion, ensuring minimal training overhead. In addition, our method can serve as a plugin to improve existing ODE samplers. Extensive experiments on various image synthesis benchmarks demonstrate the effectiveness of our EPD in achieving high-quality and low-latency sampling. For example, at the same latency level of 5 NFE, EPD achieves an FID of 4.47 on CIFAR-10, 7.97 on FFHQ, 8.17 on ImageNet, and 8.26 on LSUN Bedroom, surpassing existing learning-based solvers by a significant margin. Code is available at https://github.com/BeierZhu/EPD.
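To make the core idea concrete, here is a minimal sketch of a single ODE step that combines several independent gradient evaluations. This is an illustrative toy, not the paper's actual parameterization: the intermediate time fractions `c` and combination weights `w` stand in for the small set of learnable parameters mentioned in the abstract, and `eps_fn` stands in for the denoiser-derived drift of the probability-flow ODE.

```python
import numpy as np

def epd_style_step(x, t, t_next, eps_fn, c, w):
    """Advance x from t to t_next using K gradient evaluations.

    c: intermediate time fractions in [0, 1], shape (K,)   (assumed learnable)
    w: combination weights, shape (K,)                     (assumed learnable)
    All K evaluations depend only on the same (x, t), so they are
    independent and could be dispatched in parallel.
    """
    h = t_next - t
    # Independent evaluations at K intermediate times along the step.
    grads = [eps_fn(x, t + ci * h) for ci in c]
    # Learned convex combination replaces a single Euler direction.
    d = sum(wi * g for wi, g in zip(w, grads))
    return x + h * d

# Toy check on dx/dt = -x, integrating backward from t=1.0 to t=0.5.
eps_fn = lambda x, t: -x
x0 = np.array([1.0])
x1 = epd_style_step(x0, t=1.0, t_next=0.5, eps_fn=eps_fn,
                    c=np.array([0.0, 1.0]), w=np.array([0.5, 0.5]))
```

Because `eps_fn` here ignores `t`, the step reduces to plain Euler; with a real denoiser the K evaluations differ, and distillation would tune `c` and `w` to cancel truncation error against a high-NFE teacher trajectory.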

Beier Zhu, Ruoyu Wang, Tong Zhao, Hanwang Zhang, Chi Zhang • 2025

Related benchmarks

Task                               | Dataset                | Metric | Result | Rank
-----------------------------------|------------------------|--------|--------|-----
Image Generation                   | CIFAR-10               | FID    | 4.33   | 203
Text-to-Image Generation           | MS-COCO (val)          | FID    | 13.14  | 202
Class-conditional Image Generation | ImageNet 64x64         | FID    | 5.26   | 156
Image Generation                   | CIFAR-10 32x32         | FID    | 2.88   | 147
Unconditional Image Generation     | CIFAR-10 32x32 (test)  | FID    | 2.42   | 137
Image Generation                   | LSUN bedroom           | FID    | 7.52   | 105
Image Generation                   | ImageNet 64            | FID    | 6.35   | 100
Conditional Image Generation       | ImageNet 64x64 (val)   | FID    | 4.02   | 87
Image Generation                   | FFHQ 64x64             | FID    | 5.11   | 76
Image Generation                   | FFHQ                   | FID    | 7.84   | 70

(Showing 10 of 19 rows)
