
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

About

The practical deployment of diffusion models is still hindered by their high memory and computational overhead. Although quantization paves the way for model compression and acceleration, existing methods struggle to achieve low-bit quantization efficiently. In this paper, we identify imbalanced activation distributions as a primary source of quantization difficulty, and propose to adjust these distributions through weight finetuning to make them more quantization-friendly. We provide both theoretical and empirical evidence supporting finetuning as a practical and reliable solution. Building on this approach, we further distinguish two critical types of quantized layers: those responsible for retaining essential temporal information and those particularly sensitive to bit-width reduction. By selectively finetuning these layers under both local and global supervision, we mitigate performance degradation while enhancing quantization efficiency. Our method demonstrates its efficacy across three high-resolution image generation tasks, achieving state-of-the-art performance under multiple bit-width settings.

Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Junchi Yan, Yan Yan• 2024
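
For readers who want a concrete picture of the pipeline the abstract describes, below is a minimal PyTorch-style sketch of its two ingredients: uniform fake quantization and selective weight finetuning against a full-precision reference. Everything here is an illustrative assumption, not the paper's implementation: the names `fake_quantize` and `selective_finetune` are hypothetical, `q_model` is assumed to already apply fake quantization internally, `selected` stands for the names of the critical layers (the paper's selection criterion is not reproduced), and a single MSE loss stands in for the paper's combined local and global supervision.

```python
import itertools

import torch
import torch.nn as nn


def fake_quantize(x: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Uniform asymmetric fake quantization: map x onto an n-bit integer
    grid, then map back to floating point. Illustrative only; the paper's
    actual quantizer configuration may differ."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(-x.min() / scale)
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale


def selective_finetune(q_model: nn.Module, fp_model: nn.Module,
                       selected: list, calib_loader, steps: int = 100,
                       lr: float = 1e-5) -> None:
    """Finetune only the selected layers of a quantized model so that its
    outputs track the full-precision model (a stand-in for the paper's
    combined local/global supervision)."""
    # Freeze everything except the selected (critical) layers, e.g. the
    # temporal-information layers and the most bit-width-sensitive ones.
    for name, p in q_model.named_parameters():
        p.requires_grad = any(name.startswith(s) for s in selected)
    opt = torch.optim.Adam(
        [p for p in q_model.parameters() if p.requires_grad], lr=lr)
    mse = nn.MSELoss()
    fp_model.eval()
    for _, batch in zip(range(steps), itertools.cycle(calib_loader)):
        with torch.no_grad():
            target = fp_model(batch)      # full-precision reference output
        loss = mse(q_model(batch), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

In the paper's setting, the calibration batches would presumably also cover the diffusion timesteps so that the temporal-information layers see representative inputs; the sketch above leaves the data loader generic.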

Related benchmarks

Task | Dataset | Metric | Result | Rank
Class-conditional Image Generation | ImageNet 256x256 (val) | FID | 8.45 | 427
Image Super-resolution | DRealSR | MANIQA | 0.3541 | 130
Image Generation | LSUN Bedroom 256x256 (test) | FID | 10.1 | 73
Real-world Image Super-Resolution | RealLQ250 | MUSIQ | 0.3687 | 45
Conditional Image Generation | ImageNet 256x256 | FID | 5.98 | 42
Real-world Image Super-Resolution | DRealSR | LPIPS | 0.8322 | 35
Real-world Image Super-Resolution | RealLR200 | MUSIQ | 37.21 | 34
Real-world Image Super-Resolution | RealSR | LPIPS | 0.8663 | 31
Conditional Image Generation | ImageNet 256x256 CFG=1.5 1K (val) | IS | 4.87 | 18
Unconditional Generation | LSUN Church 256x256 (test) | FID | 6.83 | 11
