
Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space

About

Diffusion Language Models (DLMs) offer order-agnostic generation that can explore many possible decoding trajectories. However, current decoding methods commit to a single trajectory, limiting exploration in trajectory space. We introduce Order-Token Search to explore this space through jointly searching over generation order and token values. Its core is a likelihood estimator that scores denoising actions, enabling stable pruning and efficient exploration of diverse trajectories. Across mathematical reasoning and coding benchmarks, Order-Token Search consistently outperforms baselines on GSM8K, MATH500, Countdown, and HumanEval (3.1%, 3.8%, 7.9%, and 6.8% absolute over backbone), matching or surpassing diffu-GRPO post-trained d1-LLaDA. Our work establishes joint search as a key component for advancing decoding in DLMs.
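To make the idea concrete, here is a minimal, self-contained sketch of a joint search over generation order and token values, in the spirit of the abstract. It keeps a beam of partial sequences and, at each step, expands every (masked position, candidate token) denoising action, pruning by accumulated log-likelihood. The scoring function `token_logprobs` is an illustrative stand-in (a seeded pseudo-random distribution), not the paper's actual likelihood estimator, and the beam width and top-k values are arbitrary choices for the sketch.

```python
import heapq
import math
import random

VOCAB = 5  # toy vocabulary size for the sketch


def token_logprobs(seq, pos):
    # Stand-in for the DLM's denoising distribution at a masked position.
    # A real implementation would query the diffusion model; here we derive
    # a fixed pseudo-random distribution from the partial sequence so the
    # sketch is self-contained and deterministic.
    filled = tuple(t for t in seq if t is not None)
    rng = random.Random(hash((filled, pos)) & 0xFFFF)
    logits = [rng.uniform(-3.0, 0.0) for _ in range(VOCAB)]
    z = math.log(sum(math.exp(l) for l in logits))
    return [l - z for l in logits]  # normalized log-probabilities


def order_token_search(seq_len, beam_width=4, topk_tokens=2):
    """Jointly search over WHICH masked position to fill next (order) and
    WHICH token to place there (value), keeping the beam_width partial
    sequences with the highest accumulated log-likelihood."""
    beam = [(0.0, [None] * seq_len)]  # (score, partial sequence; None = masked)
    for _ in range(seq_len):
        candidates = []
        for score, seq in beam:
            for pos, tok in enumerate(seq):
                if tok is not None:
                    continue  # position already denoised
                lps = token_logprobs(seq, pos)
                # expand only the top-k token values at this position
                for t in sorted(range(VOCAB), key=lambda i: -lps[i])[:topk_tokens]:
                    new_seq = list(seq)
                    new_seq[pos] = t
                    candidates.append((score + lps[t], new_seq))
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beam[0]  # best (score, fully denoised sequence)


best_score, best_seq = order_token_search(seq_len=6)
```

Because every (position, token) pair is a candidate action, the beam can commit to different generation orders on different hypotheses, which is the trajectory-space exploration that single-order decoding forgoes.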

Yangyi Shen, Tianjian Feng, Jiaqi Han, Wen Wang, Tianlang Chen, Chunhua Shen, Jure Leskovec, Stefano Ermon • 2026

Related benchmarks

Task                    Dataset            Result                       Rank
Code Generation         HumanEval          -                            1012
Mathematical Reasoning  Countdown          Accuracy: 34.4               126
Sudoku                  Sudoku 4x4 (test)  Accuracy (Seq Len 64): 13.8  18
Mathematical Reasoning  MATH 500           Accuracy: 42.4               3
