
Structured Denoising Diffusion Models in Discrete State-Spaces

About

Denoising diffusion probabilistic models (DDPMs) (Ho et al. 2020) have shown impressive results on image and waveform generation in continuous state spaces. Here, we introduce Discrete Denoising Diffusion Probabilistic Models (D3PMs), diffusion-like generative models for discrete data that generalize the multinomial diffusion model of Hoogeboom et al. 2021, by going beyond corruption processes with uniform transition probabilities. This includes corruption with transition matrices that mimic Gaussian kernels in continuous space, matrices based on nearest neighbors in embedding space, and matrices that introduce absorbing states. The third allows us to draw a connection between diffusion models and autoregressive and mask-based generative models. We show that the choice of transition matrix is an important design decision that leads to improved results in image and text domains. We also introduce a new loss function that combines the variational lower bound with an auxiliary cross entropy loss. For text, this model class achieves strong results on character-level text generation while scaling to large vocabularies on LM1B. On the image dataset CIFAR-10, our models approach the sample quality and exceed the log-likelihood of the continuous-space DDPM model.
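To make the corruption processes above concrete, here is a minimal NumPy sketch of two of the transition-matrix families the abstract describes: a uniform matrix (as in multinomial diffusion) and an absorbing-state matrix that moves tokens to a [MASK] token. The function and variable names (`uniform_transition_matrix`, `corrupt`, `mask_id`, etc.) are illustrative, not from the paper's code, and the per-step corruption probability `beta` stands in for the paper's noise schedule.

```python
import numpy as np

def uniform_transition_matrix(beta: float, K: int) -> np.ndarray:
    """Uniform D3PM corruption: with probability beta, resample the
    token uniformly over all K states; otherwise keep it."""
    return (1.0 - beta) * np.eye(K) + beta * np.ones((K, K)) / K

def absorbing_transition_matrix(beta: float, K: int, mask_id: int) -> np.ndarray:
    """Absorbing-state corruption: with probability beta, a token jumps
    to the absorbing [MASK] state and then never leaves it."""
    Q = (1.0 - beta) * np.eye(K)
    Q[:, mask_id] += beta          # mass flows into the mask column
    Q[mask_id] = 0.0
    Q[mask_id, mask_id] = 1.0      # the mask state is absorbing
    return Q

def corrupt(x: np.ndarray, Q: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """One forward step q(x_t | x_{t-1}): each token's new category is
    sampled from the row of Q indexed by its current value."""
    probs = Q[x]                              # (seq_len, K) categorical params
    cum = probs.cumsum(axis=-1)               # inverse-CDF sampling per token
    u = rng.random((x.shape[0], 1))
    return (u < cum).argmax(axis=-1)
```

Under the absorbing-state matrix, each step either leaves a token unchanged or replaces it with [MASK], which is the property that links D3PMs to mask-based generative models; running many steps drives the whole sequence toward all-mask, mirroring how Gaussian diffusion drives images toward pure noise.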

Jacob Austin, Daniel D. Johnson, Jonathan Ho, Daniel Tarlow, Rianne van den Berg (2021)

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Language Modeling | WikiText2 | Perplexity | 77.28 | 2839 |
| Mathematical Reasoning | GSM8K | Accuracy | 30.4 | 1362 |
| Code Generation | HumanEval | -- | -- | 1036 |
| Language Modeling | PTB | Perplexity | 200.8 | 1034 |
| Question Answering | ARC Easy | -- | -- | 389 |
| Code Generation | HumanEval+ | -- | -- | 383 |
| Unconditional Image Generation | CIFAR-10 (test) | FID | 7.34 | 223 |
| Text-to-Image Generation | GenEval | -- | -- | 218 |
| Language Modeling | WikiText-103 | Perplexity | 75.16 | 189 |
| Unconditional Image Generation | CIFAR-10 unconditional | FID | 7.34 | 165 |

Showing 10 of 83 rows.
