Any-Order Flexible Length Masked Diffusion

About

Masked diffusion models (MDMs) have recently emerged as a promising alternative to autoregressive models over discrete domains. MDMs generate sequences in an any-order, parallel fashion, enabling fast inference and strong performance on non-causal tasks. However, a crucial limitation is that they do not support token insertions and are thus limited to fixed-length generation. To address this, we introduce Flexible Masked Diffusion Models (FlexMDMs), a discrete diffusion paradigm that can model sequences of flexible length while provably retaining MDMs' flexibility of any-order inference. Grounded in an extension of the stochastic interpolant framework, FlexMDMs generate sequences by inserting mask tokens and unmasking them. Empirically, we show that FlexMDMs match MDMs in perplexity while modeling length statistics with much higher fidelity. On a synthetic maze planning task, they achieve a $\approx 60 \%$ higher success rate than MDM baselines. Finally, we show that pretrained MDMs can easily be retrofitted into FlexMDMs: on 16 H100s, it takes only three days to fine-tune LLaDA-8B into a FlexMDM, which improves performance on math (GSM8K, $58\% \to 67\%$) and code infilling ($52\% \to 65\%$).
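To make the "insert mask tokens, then unmask them" idea concrete, here is a toy sketch. It is not the paper's sampler: the function name `toy_insert_and_unmask`, its parameters, and the uniform random choices are all assumptions for illustration, standing in for the learned insertion and unmasking predictions that an actual FlexMDM would provide.

```python
import random

MASK = "<mask>"  # placeholder token, hypothetical name


def toy_insert_and_unmask(vocab, num_steps=8, max_inserts_per_step=2, seed=0):
    """Toy loop: grow a sequence by inserting masks, then fill them in.

    Random choices stand in for a learned model. This only illustrates
    how sequence length can vary during generation.
    """
    rng = random.Random(seed)
    seq = []  # generation starts from an empty sequence
    for _ in range(num_steps):
        # Insertion: add a few mask tokens at arbitrary positions.
        for _ in range(rng.randint(0, max_inserts_per_step)):
            seq.insert(rng.randint(0, len(seq)), MASK)
        # Unmasking: reveal a random subset of the masks, in any order.
        for i, tok in enumerate(seq):
            if tok == MASK and rng.random() < 0.5:
                seq[i] = rng.choice(vocab)
    # Final pass: fill any masks that are still left.
    return [rng.choice(vocab) if tok == MASK else tok for tok in seq]


sample = toy_insert_and_unmask(vocab=["a", "b", "c"])
print(len(sample), sample)
```

The final length is not fixed in advance: it depends on how many masks were inserted along the way, which is the property a fixed-length MDM lacks.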

Jaeyeon Kim, Lee Cheuk-Kit, Carles Domingo-Enrich, Yilun Du, Sham Kakade, Timothy Ngotiaoco, Sitan Chen, Michael Albergo • 2025

Related benchmarks

Task	Dataset	Result
De novo small molecule generation	SAFE (test)	Validity 98.9	22
Code Infilling	HumanEval Infilling	--	19
Star graph traversal	Star graphs medium	Exact Match Acc 91.3	9
Star graph traversal	Star graphs hard	Exact Match Rate 7.4	9
Maze Planning	Imperfect Maze Planning Hard	Accuracy 84.2	6
Maze Planning	Braided Maze Planning Easy	Accuracy 86.9	6
Maze Planning	Braided Maze Planning Medium	Accuracy 0.867	6
Maze Planning	Braided Maze Planning Hard	Accuracy 89.5	6
Maze Planning	Imperfect Maze Planning Medium	Accuracy 83.1	6
Maze Planning	Imperfect Maze Planning Easy	Accuracy 78.5	6

Showing 10 of 14 rows
