
Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space

About

Diffusion Language Models (DLMs) offer order-agnostic generation that can explore many possible decoding trajectories. However, current decoding methods commit to a single trajectory, limiting exploration in trajectory space. We introduce Order-Token Search to explore this space through jointly searching over generation order and token values. Its core is a likelihood estimator that scores denoising actions, enabling stable pruning and efficient exploration of diverse trajectories. Across mathematical reasoning and coding benchmarks, Order-Token Search consistently outperforms baselines on GSM8K, MATH500, Countdown, and HumanEval (3.1%, 3.8%, 7.9%, and 6.8% absolute over backbone), matching or surpassing diffu-GRPO post-trained d1-LLaDA. Our work establishes joint search as a key component for advancing decoding in DLMs.
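To make the idea concrete, here is a minimal, self-contained sketch of a joint search over generation order and token values, in the spirit of the abstract. It keeps a beam of partial sequences and, at each step, expands every (masked position, candidate token) denoising action, pruning by accumulated log-likelihood. The scoring function `token_logprobs` is an illustrative stand-in (a seeded pseudo-random distribution), not the paper's actual likelihood estimator, and the beam width and top-k values are arbitrary choices for the sketch.

```python
import heapq
import math
import random

VOCAB = 5  # toy vocabulary size for the sketch


def token_logprobs(seq, pos):
    # Stand-in for the DLM's denoising distribution at a masked position.
    # A real implementation would query the diffusion model; here we derive
    # a fixed pseudo-random distribution from the partial sequence so the
    # sketch is self-contained and deterministic.
    filled = tuple(t for t in seq if t is not None)
    rng = random.Random(hash((filled, pos)) & 0xFFFF)
    logits = [rng.uniform(-3.0, 0.0) for _ in range(VOCAB)]
    z = math.log(sum(math.exp(l) for l in logits))
    return [l - z for l in logits]  # normalized log-probabilities


def order_token_search(seq_len, beam_width=4, topk_tokens=2):
    """Jointly search over WHICH masked position to fill next (order) and
    WHICH token to place there (value), keeping the beam_width partial
    sequences with the highest accumulated log-likelihood."""
    beam = [(0.0, [None] * seq_len)]  # (score, partial sequence; None = masked)
    for _ in range(seq_len):
        candidates = []
        for score, seq in beam:
            for pos, tok in enumerate(seq):
                if tok is not None:
                    continue  # position already denoised
                lps = token_logprobs(seq, pos)
                # expand only the top-k token values at this position
                for t in sorted(range(VOCAB), key=lambda i: -lps[i])[:topk_tokens]:
                    new_seq = list(seq)
                    new_seq[pos] = t
                    candidates.append((score + lps[t], new_seq))
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beam[0]  # best (score, fully denoised sequence)


best_score, best_seq = order_token_search(seq_len=6)
```

Because every (position, token) pair is a candidate action, the beam can commit to different generation orders on different hypotheses, which is the trajectory-space exploration that single-order decoding forgoes.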

Yangyi Shen, Tianjian Feng, Jiaqi Han, Wen Wang, Tianlang Chen, Chunhua Shen, Jure Leskovec, Stefano Ermon • 2026

Related benchmarks

Task                    Dataset            Result                       Rank
Code Generation         HumanEval          -                            1012
Mathematical Reasoning  Countdown          Accuracy: 34.4               126
Sudoku                  Sudoku 4x4 (test)  Accuracy (Seq Len 64): 13.8  18
Mathematical Reasoning  MATH 500           Accuracy: 42.4               3
