Consistent Diffusion Language Models

About

Diffusion language models (DLMs) are an attractive alternative to autoregressive models because they promise sublinear-time, parallel generation, yet practical gains remain elusive as high-quality samples still demand hundreds of refinement steps. In continuous domains, consistency training along the probability-flow ODE is a popular recipe to accelerate diffusion. For discrete diffusion, no analogous sample-space ODE exists, making direct adaptation ill-defined. We argue that the right discrete substitute is the exact posterior bridge, the closed-form conditional law linking any two noise levels, which is available for broad corruptions including masked and uniform diffusion. Building on this observation, we introduce Multi-Path Discrete Consistency (MPDC), a new principle that trains a denoiser to be path-invariant in expectation across these stochastic bridges, and instantiate it as the Consistent Diffusion Language Model (CDLM), a single-stage training framework that does not require an already trained teacher model. Our CDLM objective recovers masked diffusion, continuous consistency models, and progressive or discrete distillation as analytic limits or empirical approximations of one common view. Empirically, CDLM establishes a new state of the art on both conditional and unconditional text generation, consistently outperforming strong base discrete diffusion models and often even multi-stage distilled baselines across sampling budgets, with the largest gains in the few-step regime. Together, these results position CDLM as a principled and scalable foundation for the next generation of fast, high-fidelity discrete generative modeling.
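
To make the "posterior bridge" in the abstract concrete, here is a minimal sketch for the masked (absorbing) case. It is not the authors' code. It uses the standard closed-form posterior from the masked-diffusion literature and assumes a linear schedule alpha(t) = 1 - t, where alpha(t) is the probability that a token is still unmasked at time t. The names MASK, alpha, bridge_keep_prob, and sample_bridge are hypothetical and chosen for illustration only.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def alpha(t):
    """Assumed linear schedule: P(token still unmasked at time t), t in [0, 1]."""
    return 1.0 - t


def bridge_keep_prob(s, t):
    """P(x_s = x_0 | x_t = MASK, x_0) for 0 <= s < t <= 1 in absorbing diffusion."""
    if not 0.0 <= s < t <= 1.0:
        raise ValueError("require 0 <= s < t <= 1")
    a_s, a_t = alpha(s), alpha(t)
    return (a_s - a_t) / (1.0 - a_t)


def sample_bridge(x0, xt, s, t, rng):
    """Sample x_s ~ q(x_s | x_t, x_0) independently for each token position."""
    p = bridge_keep_prob(s, t)
    xs = []
    for clean, noisy in zip(x0, xt):
        if noisy != MASK:
            xs.append(noisy)   # tokens visible at time t are visible at time s
        elif rng.random() < p:
            xs.append(clean)   # masked token is revealed as its clean value
        else:
            xs.append(MASK)    # masked token stays masked at time s
    return xs


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = ["the", "cat", "sat", "down"]
    xt = ["the", MASK, MASK, "down"]
    print(sample_bridge(x0, xt, s=0.25, t=0.5, rng=rng))
```

Under this assumed schedule, bridging from t = 0.5 to s = 0.25 reveals each masked token with probability 0.5, and bridging all the way to s = 0 reveals every token. Different random draws give different intermediate states x_s for the same pair (x_0, x_t), which is the sense in which several stochastic paths connect the same two noise levels.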

Hasan Amin, Yuan Gao, Yaser Souri, Subhojit Som, Ming Yin, Rajiv Khanna, Xia Song • 2026

Related benchmarks

Task                           Dataset      Result             Rank
Unconditional Text Generation  OpenWebText  Gen. PPL 27.1      219
Conditional Generation         OpenWebText  Perplexity 27.9    12
Conditional Generation         Wikitext103  Perplexity 26.9    12
Conditional Generation         LAMBADA      Perplexity 38.5    12
Conditional Generation         PTB          Perplexity 106.8   12
Conditional Text Generation    OpenWebText  MAUVE Score 0.99   9
Conditional Text Generation    Wikitext103  MAUVE 0.99         9
Conditional Text Generation    LAMBADA      MAUVE 0.99         9
Conditional Text Generation    PTB          MAUVE Score 0.07   9