Sampling-Aware Quantization for Diffusion Models
About
Diffusion models have recently emerged as the dominant approach in visual generation tasks. However, the lengthy denoising chains and the computationally intensive noise estimation networks hinder their applicability in low-latency and resource-limited environments. Previous research has endeavored to address these limitations in a decoupled manner, utilizing either advanced samplers or efficient model quantization techniques. In this study, we uncover that quantization-induced noise disrupts directional estimation at each sampling step, further distorting the precise directional estimations of higher-order samplers when solving the sampling equations through discretized numerical methods, thereby altering the optimal sampling trajectory. To attain dual acceleration with high fidelity, we propose a sampling-aware quantization strategy, wherein a Mixed-Order Trajectory Alignment technique is devised to impose a more stringent constraint on the error bounds at each sampling step, facilitating a more linear probability flow. Extensive experiments on sparse-step fast sampling across multiple datasets demonstrate that our approach preserves the rapid convergence characteristics of high-speed samplers while maintaining superior generation quality. Code is publicly available at: https://github.com/TaylorJocelyn/Sampling-aware-Quantization.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Class-conditional Image Generation | ImageNet 256x256 (val) | Inception Score (IS)242 | 493 | |
| Image Generation | LSUN Bedroom 256x256 (test) | FID8.79 | 81 | |
| Unconditional Image Generation | LSUN Churches 256 x 256 | FID10.07 | 26 | |
| Text-guided Image Generation | MS-COCO 512 x 512 v1.4 (val) | FID13.1 | 6 |