
Compensation Sampling for Improved Convergence in Diffusion Models

About

Diffusion models achieve remarkable quality in image generation, but at a cost: iterative denoising requires many time steps to produce high-fidelity images. We argue that the denoising process is crucially limited by the accumulation of reconstruction error caused by an inaccurate initial reconstruction of the target data. This leads to lower-quality outputs and slower convergence. To address this issue, we propose compensation sampling to guide the generation towards the target domain. We introduce a compensation term, implemented as a U-Net, which adds negligible computational overhead during training and, optionally, inference. Our approach is flexible, and we demonstrate its application to unconditional generation, face inpainting, and face de-occlusion on the benchmark datasets CIFAR-10, CelebA, CelebA-HQ, FFHQ-256, and FSG. Our approach consistently yields state-of-the-art results in terms of image quality, while accelerating training convergence of the denoising process by up to an order of magnitude.
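The abstract describes adding a learned compensation term to the iterative denoising process. The paper's exact update rule is not reproduced on this page, so the following is only an illustrative sketch: a toy DDPM-style reverse loop in which a hypothetical `comp_model` (standing in for the compensation U-Net) additively corrects the denoiser's noise prediction at each step. All function names and the placement of the correction are assumptions for illustration.

```python
import numpy as np

def reverse_diffusion(x_T, eps_model, comp_model, betas, use_compensation=True, seed=0):
    """Toy DDPM-style reverse process with an additive compensation term.

    eps_model and comp_model are stand-ins for a trained denoiser and the
    compensation U-Net mentioned in the abstract; how the compensation term
    enters the update is an assumption here, not the paper's formulation.
    """
    rng = np.random.default_rng(seed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = x_T
    for t in reversed(range(len(betas))):
        eps = eps_model(x, t)                    # predicted noise at step t
        if use_compensation:
            eps = eps + comp_model(x, t)         # hypothetical additive correction
        # standard DDPM posterior-mean estimate
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                                # add noise except at the final step
            x = x + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

# Usage with dummy zero-output models, just to exercise the loop:
betas = np.linspace(1e-4, 0.02, 10)
eps_model = lambda x, t: np.zeros_like(x)
comp_model = lambda x, t: np.zeros_like(x)
sample = reverse_diffusion(np.random.default_rng(1).standard_normal((4, 4)),
                           eps_model, comp_model, betas)
```

With zero-output models the loop reduces to the plain DDPM update; the point is only to show where a compensation correction could slot into the sampling iteration.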

Hui Lu, Albert Ali Salah, Ronald Poppe • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Unconditional Image Generation | CIFAR-10 (test) | -- | 216 |
| Unconditional Image Generation | CelebA 64x64 (unconditional) | FID 1.21 | 95 |
| Unconditional Image Generation | FFHQ 256x256 | FID 2.57 | 64 |
| Face inpainting (Half) | CelebA-HQ-256 (test) | LPIPS 0.272 | 12 |
| Face de-occlusion | FSG | PSNR 31.3842 | 8 |
| Face inpainting (Completion) | CelebA-HQ-256 (test) | LPIPS 0.259 | 8 |
| Face inpainting (Expand) | CelebA-HQ-256 (test) | LPIPS 0.372 | 8 |
| Face inpainting (Medium Line) | CelebA-HQ-256 (test) | LPIPS 0.064 | 8 |
| Face inpainting (Thick Line) | CelebA-HQ-256 (test) | LPIPS 0.079 | 8 |
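The benchmark results are reported in FID, LPIPS, and PSNR. Of these, PSNR has a simple closed form (higher is better); the following minimal sketch computes it from mean squared error, assuming images scaled to a peak value of 1.0. FID and LPIPS, by contrast, require a pretrained network and are not reproduced here.

```python
import numpy as np

def psnr(ref, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example: a uniform error of 0.1 gives MSE = 0.01, i.e. PSNR = 20 dB.
print(psnr(np.zeros((8, 8)), np.full((8, 8), 0.1)))  # → 20.0
```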
