
DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking

About

Masked diffusion models (MDMs) generate text by iteratively selecting positions to unmask and then predicting tokens at those positions. Yet MDMs lack proper likelihood evaluation: the evidence lower bound (ELBO) is not only a loose bound on log-likelihood, but, as we show, is also computed under the training distribution rather than the test-time distribution. We resolve this within our DUEL framework, which unifies leading MDM sampling strategies that employ $\textit{deterministic}$ position selection. We prove that DUEL samplers admit $\textbf{exact likelihood computation under the test-time distribution}$ -- giving MDMs $\textit{proper}$ likelihood, and hence proper perplexity, for the first time. This proper perplexity is the natural analogue of autoregressive perplexity and lets us revisit key questions about MDMs. $\textbf{MDMs are substantially better than previously thought}$: the MDM-autoregressive perplexity gap shrinks by up to $32\%$ on in-domain data and $82\%$ on zero-shot benchmarks. DUEL enables the first principled comparison of fast, parallel samplers across compute budgets -- an analysis impossible with the ELBO and unreliable with generative perplexity -- identifying a strong default method. Finally, oracle search over position orderings reveals MDMs can far surpass autoregressive models -- achieving $36.47$ vs. $52.11$ perplexity on AG News -- demonstrating that the ceiling of MDM performance has not yet been reached.
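To make the key claim concrete: if the rule that selects the next position to unmask depends only on the current partially-masked state, then scoring a fixed sequence $x$ traverses a single unmasking order $\sigma$, and the likelihood factorizes exactly as $\log p(x) = \sum_t \log p(x_{\sigma_t} \mid x_{\sigma_{<t}})$. The sketch below is a minimal illustration, not the paper's implementation: it uses a max-confidence position rule (one common deterministic choice; the actual rules DUEL unifies are specified in the paper) and assumes a hypothetical PyTorch `model` that maps a `(batch, seq)` token tensor to per-position vocabulary logits.

```python
import torch

@torch.no_grad()
def exact_log_likelihood(model, tokens, mask_id):
    """Exact log-likelihood of `tokens` under a deterministic-unmasking
    sampler -- a hypothetical sketch, not the official DUEL code.

    Because the position rule is a deterministic function of the current
    partially-masked state, scoring a sequence visits one unmasking
    order, and the per-step token log-probs sum to an exact log p(x).
    """
    x = torch.full_like(tokens, mask_id)           # start fully masked
    total_log_prob = 0.0
    for _ in range(tokens.numel()):
        logits = model(x.unsqueeze(0)).squeeze(0)  # (seq_len, vocab)
        log_probs = logits.log_softmax(dim=-1)
        masked = (x == mask_id).nonzero().squeeze(-1)
        # Deterministic position rule (assumed here): pick the masked
        # position whose top predicted token has the highest probability.
        conf = log_probs[masked].max(dim=-1).values
        pos = masked[conf.argmax()]
        # Score the *true* token at that position, then reveal it.
        total_log_prob += log_probs[pos, tokens[pos]].item()
        x[pos] = tokens[pos]
    return total_log_prob
```

The proper perplexity is then `exp(-total_log_prob / tokens.numel())`, directly comparable to autoregressive perplexity.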

Gilad Turok, Chris De Sa, Volodymyr Kuleshov • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Language Modeling | WikiText | PPL | 14.5 | 732 |
| Language Modeling | LAMBADA | PPL | 36 | 150 |
| Language Modeling | AG News | PPL | 78.91 | 36 |
| Language Modeling | PTB (zero-shot) | -- | -- | 23 |
| Language Modeling | AG News (zero-shot) | -- | -- | 10 |
| Language Modeling | WikiText (zero-shot) | -- | -- | 3 |
