
Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation

About

Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research on diffusion models suggest an inverse correlation between density estimation and sample generation performance. This paper shows, with extensive empirical evidence, that this inverse correlation arises because density estimation is dominated by small diffusion times, whereas sample generation depends mainly on large diffusion times. However, training a score network well across the entire range of diffusion times is demanding because the loss scale is severely imbalanced across diffusion times. For successful training, therefore, we introduce Soft Truncation, a universally applicable training technique for diffusion models that softens the fixed, static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on the CIFAR-10, CelebA, CelebA-HQ 256x256, and STL-10 datasets.
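The core idea of Soft Truncation can be sketched in a few lines: rather than fixing the smallest diffusion time at a static truncation hyperparameter, each mini-batch draws a random truncation level and then samples diffusion times above it. The sketch below is a minimal illustration, assuming a log-uniform prior over the truncation level (the paper's exact prior and hyperparameters may differ).

```python
import numpy as np

def sample_soft_truncation_times(batch_size, eps_min=1e-5, T=1.0, rng=None):
    """Soft Truncation time sampling (illustrative sketch).

    Instead of training with a fixed truncation eps, each mini-batch
    draws its own truncation level eps from a prior p(eps), then samples
    diffusion times uniformly on [eps, T]. Here p(eps) is assumed
    log-uniform on [eps_min, T]; this is one plausible choice, not
    necessarily the prior used in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Per-batch random truncation level, log-uniform on [eps_min, T].
    eps = np.exp(rng.uniform(np.log(eps_min), np.log(T)))
    # Diffusion times for this mini-batch, restricted to [eps, T].
    t = rng.uniform(eps, T, size=batch_size)
    return eps, t

eps, t = sample_soft_truncation_times(batch_size=128)
```

Because small diffusion times only enter the loss for batches that happen to draw a small truncation level, the score network sees a softened mixture of truncations rather than one fixed cutoff.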

Dongjun Kim, Seungjae Shin, Kyungwoo Song, Wanmo Kang, Il-Chul Moon• 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Unconditional Image Generation | CIFAR-10 (test) | FID 2.47 | 216 |
| Image Generation | CelebA 64x64 (test) | FID 1.9 | 203 |
| Unconditional Image Generation | CIFAR-10 | FID 2.47 | 171 |
| Image Generation | CIFAR10 32x32 (test) | FID 2.33 | 154 |
| Unconditional Generation | CIFAR-10 (test) | FID 2.33 | 102 |
| Unconditional Image Generation | CelebA unconditional 64x64 | FID 1.9 | 95 |
| Image Generation | CIFAR-10 (train/test) | -- | 78 |
| Image Generation | CelebA-HQ 256x256 | FID 7.16 | 51 |
| Density Estimation | CIFAR-10 | bpd 3.04 | 40 |
| Image Generation | CelebA-HQ 256x256 (test) | FID 7.16 | 34 |

(10 of 28 rows shown)

Other info

Code
