The Path Matters: Learning a Token-Commitment Policy for Diffusion Language Models

About

Diffusion large language models promise faster generation by refining many token positions in parallel, but this parallelism introduces a hidden control problem: which proposed tokens should be transferred into the partially decoded sequence at each step? We refer to this decision as token commitment. Existing frozen-generator decoders largely rely on hand-designed confidence rules or block-specific acceptance filters. We argue that token commitment can instead be learned as a reusable trace-state policy. We introduce TraceLock, a lightweight plug-in controller that instantiates this policy for a frozen diffusion language model. Since oracle commitment times are unavailable, TraceLock derives self-supervision from future stability: at decoding step t, a proposed token for position i is labeled stable if it matches the final token at position i after the full decoding trace completes. The controller scores variable-length trace states and decides which active token proposals should be committed to the partially decoded sequence. Once trained for a given frozen backbone, the controller can be deployed across local-window widths, generation lengths, and step budgets without retraining or per-setting calibration. Experiments on question answering, mathematical reasoning, and code generation show that TraceLock improves the quality-step tradeoff over heuristic and learned baselines, with particularly stable behavior under cross-setting deployment. Diagnostic analyses show that its decisions are not reducible to scalar confidence, suggesting that frozen diffusion language models expose a learnable space of commitment trajectories beyond confidence-based decoding. Code is available at https://github.com/BobSun98/TraceLock.
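
The future-stability labeling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `future_stability_labels`, the list-of-lists trace layout, and the use of `None` for positions with no active proposal at a step are all assumptions made for illustration.

```python
from typing import List, Optional


def future_stability_labels(
    proposals: List[List[Optional[int]]],
    final_tokens: List[int],
) -> List[List[Optional[int]]]:
    """Label each proposed token by whether it survives to the final output.

    proposals[t][i] is the token id proposed at decoding step t for
    position i, or None if position i has no active proposal at that step
    (hypothetical layout, assumed for this sketch).
    final_tokens[i] is the token at position i once the trace completes.

    Returns labels[t][i] = 1 if the proposal matches the final token,
    0 if it does not, and None where there was no proposal.
    """
    labels: List[List[Optional[int]]] = []
    for step in proposals:
        if len(step) != len(final_tokens):
            raise ValueError("each step must cover every sequence position")
        labels.append(
            [
                None if token is None else int(token == final)
                for token, final in zip(step, final_tokens)
            ]
        )
    return labels


# Toy trace: 3 decoding steps over 3 positions, with made-up token ids.
trace = [
    [5, 9, 2],        # step 0: only position 0 already agrees with the final output
    [5, 7, None],     # step 1: position 2 has no active proposal
    [None, 7, 3],     # step 2: position 0 has no active proposal
]
final = [5, 7, 3]
labels = future_stability_labels(trace, final)
```

Labels of this kind require no oracle commitment times: they are read off a completed decoding trace, which is what lets them serve as self-supervision for a commitment controller.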

Bohang Sun, Max Zhu, Francesco Caso, Jindong Gu, Junchi Yu, Philip Torr, Pietro Liò, Jialin Yu • 2026

Related benchmarks

Task	Dataset	Metric	Result	
Mathematical Reasoning	GSM8K	Accuracy (Acc)	80	352
Mathematical Reasoning	MATH	Accuracy (%)	80.8	52
Code Generation	Coding	Pass@1	56.1	40
Question Answering	QA	Average Rank	1.73	40
Generation	Math Domain	Average Generation Time (s)	2.96	40
Generation	QA domain	Average Generation Time (s)	2.87	40
Generation	Coding domain	Average Wall-Clock Time (s)	5.03	40
Question Answering	Natural Questions	Average Rank	2.6	10
