Diffusion Language Model Inference with Monte Carlo Tree Search
About
Diffusion language models (DLMs) have recently emerged as a compelling alternative to autoregressive generation, offering parallel decoding and improved global coherence. During inference, DLMs generate text by iteratively denoising masked sequences in parallel; however, determining which positions to unmask and which tokens to commit forms a large combinatorial search problem. Existing inference methods approximate this search with heuristics, which often yield suboptimal decoding paths; other approaches instead rely on additional training to guide token selection. To bring a principled search mechanism to DLM inference, we propose MEDAL, an inference-time scaling framework that integrates Monte Carlo Tree SEarch initialization for Diffusion LAnguage Model inference. We employ Monte Carlo Tree Search at the initialization stage to explore promising unmasking trajectories, providing a robust starting point for subsequent refinement. This design enables efficient inference-time scaling, allowing generation quality to improve as the search budget increases, without additional training. Across multiple benchmarks, MEDAL achieves up to a 22.0% improvement over existing inference strategies, establishing a new paradigm for search-based inference in DLMs.
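To make the search concrete, the sketch below shows a generic Monte Carlo Tree Search over unmasking orders. This is an illustration, not the paper's actual algorithm: `trajectory_reward` is a hypothetical stand-in for scoring a decoding path (a real system would use the DLM's token confidences or likelihoods), and all function and class names are invented for this example.

```python
import math
import random

def trajectory_reward(order):
    """Toy reward for an unmasking order (hypothetical placeholder).

    Favors unmasking lower-index positions earlier; a real DLM would
    instead score the denoised sequence with model confidences.
    """
    n = len(order)
    return sum((n - step) * (n - pos) for step, pos in enumerate(order)) / n**3

class Node:
    def __init__(self, unmasked, parent=None):
        self.unmasked = unmasked   # tuple of positions unmasked so far
        self.parent = parent
        self.children = {}         # next position -> child Node
        self.visits = 0
        self.value = 0.0

def ucb(child, parent_visits, c=1.4):
    """Upper Confidence Bound used for child selection."""
    if child.visits == 0:
        return float("inf")
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def mcts_unmask_order(n_positions, n_simulations=200, seed=0):
    """Search for a good unmasking order over `n_positions` tokens."""
    rng = random.Random(seed)
    root = Node(())
    all_pos = set(range(n_positions))
    for _ in range(n_simulations):
        node = root
        # Selection: descend while the node is fully expanded.
        while node.children and len(node.children) == len(all_pos - set(node.unmasked)):
            node = max(node.children.values(),
                       key=lambda ch: ucb(ch, node.visits))
        # Expansion: try one untried position.
        untried = list(all_pos - set(node.unmasked) - set(node.children))
        if untried:
            pos = rng.choice(untried)
            child = Node(node.unmasked + (pos,), parent=node)
            node.children[pos] = child
            node = child
        # Rollout: finish the order randomly and score it.
        rollout = list(all_pos - set(node.unmasked))
        rng.shuffle(rollout)
        reward = trajectory_reward(list(node.unmasked) + rollout)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Extract the most-visited path as the chosen (possibly partial) order.
    order, node = [], root
    while node.children:
        pos, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        order.append(pos)
    return order
```

The most-visited path then serves as the initialization for denoising; remaining positions can be filled by any standard refinement pass.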
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 69 | 983 |
| Code Generation | HumanEval | Pass@1 | 51 | 850 |
| Language Understanding | MMLU | Accuracy | 46.5 | 756 |
| General Knowledge | MMLU | Accuracy | 44.5 | 170 |
| Science Question Answering | ARC-C | Accuracy | 88.5 | 127 |
| Reading Comprehension | DROP | Accuracy | 73 | 103 |
| Reading Comprehension | DROP | F1 Score | 71.1 | 55 |
| Mathematical Reasoning | Countdown | Accuracy | 19.2 | 36 |