TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
About
Denoising Diffusion models have demonstrated their proficiency for generative sampling. However, generating good samples often requires many iterations. Consequently, techniques such as binary time-distillation (BTD) have been proposed to reduce the number of network calls for a fixed architecture. In this paper, we introduce TRAnsitive Closure Time-distillation (TRACT), a new method that extends BTD. For single-step diffusion, TRACT improves FID by up to 2.4x on the same architecture, and achieves new single-step Denoising Diffusion Implicit Models (DDIM) state-of-the-art FID (7.4 for ImageNet64, 3.8 for CIFAR10). Finally, we tease apart the method through extended ablations. The PyTorch implementation will be released soon.
David Berthelot, Arnaud Autef, Jierui Lin, Dian Ang Yap, Shuangfei Zhai, Siyuan Hu, Daniel Zheng, Walter Talbott, Eric Gu• 2023
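To make the abstract's distillation idea concrete, here is an illustrative sketch (not the paper's released code; the function names `btd_phases` and `tract_target` are hypothetical) contrasting BTD's repeated step-halving with a TRACT-style grouped schedule:

```python
import math

def btd_phases(num_steps: int) -> int:
    """BTD halves the number of sampling steps per distillation phase,
    so reaching a 1-step sampler from `num_steps` takes log2(num_steps)
    sequential distillation phases."""
    return int(math.log2(num_steps))

def tract_target(t: int, num_steps: int, num_groups: int) -> int:
    """TRACT-style grouping (illustrative): partition the time schedule
    into contiguous segments and train the student to jump from step t
    directly to the start of its segment -- the 'transitive closure' of
    the chain of binary jumps BTD would take."""
    group_size = num_steps // num_groups
    return (t // group_size) * group_size

# Example: distilling a 1024-step teacher.
print(btd_phases(1024))            # BTD needs 10 sequential phases
print(tract_target(700, 1024, 2))  # with 2 groups, step 700 jumps to 512
print(tract_target(100, 1024, 2))  # steps in the first group jump to 0
```

Collapsing many binary jumps into one long jump per segment is what lets the schedule reach few-step sampling in far fewer distillation phases than BTD.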
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Unconditional Image Generation | CIFAR-10 (test) | FID 3.32 | 216 |
| Unconditional Image Generation | CIFAR-10 | FID 3.78 | 171 |
| Unconditional Image Generation | CIFAR-10 unconditional | FID 3.32 | 159 |
| Image Generation | ImageNet 64x64 resolution (test) | FID 4.97 | 150 |
| Class-conditional Image Generation | ImageNet 64x64 | FID 2.41 | 126 |
| Image Generation | CIFAR-10 | FID 3.78 | 95 |
| Unconditional Image Generation | CIFAR-10 32x32 (test) | FID 3.78 | 94 |
| Class-conditional Image Generation | ImageNet 64x64 (test) | FID 4.97 | 86 |
| Image Generation | ImageNet 64x64 (val) | FID 7.43 | 48 |
| Image Generation | CIFAR-10 unconditional (test) | FID 3.32 | 39 |