
Why Gaussian Diffusion Models Fail on Discrete Data?

About

Diffusion models have become a standard approach for generative modeling in continuous domains, yet their application to discrete data remains challenging. We investigate why Gaussian diffusion models with the DDPM solver struggle to sample from discrete distributions that are represented as a mixture of delta distributions in continuous space. Using a toy Random Hierarchy Model, we identify a critical sampling interval in which the density of noisified data becomes multimodal. In this regime, DDPM occasionally enters low-density regions between modes, producing out-of-distribution inputs for the model and degrading sample quality. We show that existing heuristics, including self-conditioning and a solver we term q-sampling, help alleviate this issue. Furthermore, we demonstrate that combining self-conditioning with a switch from DDPM to q-sampling within the critical interval improves generation quality on real data. We validate these findings across conditional and unconditional tasks in multiple domains, including text, programming code, and proteins.
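The multimodality effect described above can be illustrated with a minimal sketch (our own toy example, not the paper's code): embed a two-symbol discrete distribution as delta peaks at -1 and +1, add Gaussian noise of scale sigma, and check whether the resulting density has a low-density valley between the modes. For equal-weight peaks at ±1, the mixture is bimodal exactly when sigma < 1, which marks the kind of critical interval where a DDPM sampler can land between modes.

```python
import math

def noisy_density(x, sigma):
    """Density of 0.5*N(-1, sigma^2) + 0.5*N(+1, sigma^2): two delta
    peaks at -1 and +1 after Gaussian noising with scale sigma."""
    g = lambda mu: math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return 0.5 * g(-1.0) + 0.5 * g(1.0)

def is_multimodal(sigma):
    """Bimodal iff the midpoint x=0 has lower density than a peak
    location x=1, i.e. a valley has opened between the modes."""
    return noisy_density(0.0, sigma) < noisy_density(1.0, sigma)

# At high noise the two peaks have merged into a single mode; once
# sigma drops below 1 (half the peak separation), a low-density valley
# opens at x=0 -- samples there are out-of-distribution for a denoiser
# trained on data concentrated near the modes.
print(is_multimodal(2.0))  # prints False (unimodal regime)
print(is_multimodal(0.3))  # prints True  (critical, bimodal regime)
```

The same picture extends to higher-dimensional token embeddings: the valley between delta-mixture modes is where, per the abstract, DDPM occasionally produces inputs the model never saw during training.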

Alexander Shabalin, Simon Elistratov, Viacheslav Meshchaninov, Ildus Sadrtdinov, Dmitry Vetrov• 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Unconditional Text Generation | OpenWebText | Gen. PPL | 34.4 | 100 |
| Abstractive Summarization | XSum | ROUGE-1 | 31.9 | 22 |
| Paraphrase Generation | QQP | BLEU | 34.13 | 19 |
| Unconditional Text Generation | ROCStories | MAUVE | 86.8 | 18 |
| Unconditional Text Generation | Wikipedia | MAUVE | 90.1 | 18 |
| Code Generation | CoNaLa | CodeBS | 68.1 | 4 |
| Machine Translation | IWSLT14 | BLEU | 28.93 | 4 |
| Protein Sequence Generation | SwissProt subset of UniProt | FD-seq | 1.045 | 4 |
