Advancing Block Diffusion Language Models for Test-Time Scaling

About

Recent advances in block diffusion language models (BDLMs) have demonstrated competitive performance and strong scalability on reasoning tasks. However, existing BDLMs remain underexplored in the test-time scaling setting and face more severe decoding challenges in long Chain-of-Thought reasoning, particularly in balancing decoding speed against effectiveness. In this work, we propose a unified framework for test-time scaling in BDLMs that introduces adaptivity in both decoding and block-wise generation. At the decoding level, we propose Bounded Adaptive Confidence Decoding (BACD), a difficulty-aware sampling strategy that dynamically adjusts denoising based on model confidence, accelerating inference while controlling error accumulation. Beyond step-wise adaptivity, we introduce Think Coarse, Critic Fine (TCCF), a test-time scaling paradigm that allocates large block sizes to exploratory reasoning and smaller block sizes to refinement, balancing efficiency and effectiveness. To keep decoding efficient and effective at large block sizes, we adopt Progressive Block Size Extension, which mitigates the performance degradation that occurs when scaling block sizes. Extensive experiments show that applying BACD and TCCF to TDAR-8B yields significant improvements over strong baselines such as TraDo-8B (2.26x speedup, +11.2 points on AIME24). These results mark an important step toward unlocking the potential of BDLMs for test-time scaling in complex reasoning tasks.
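The abstract does not spell out the BACD update rule, so the following is only a minimal sketch of one plausible reading of "bounded adaptive confidence" decoding: in each denoising step, commit every masked token whose confidence clears a threshold (the adaptive part), but clamp the number committed to a fixed range (the bounded part). The names `select_positions` and `decode_block`, the parameters `threshold`, `min_tokens`, and `max_tokens`, and the stand-in `score_fn` are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def select_positions(
    confidences: Sequence[float],
    masked: Sequence[bool],
    threshold: float = 0.9,   # assumed confidence cutoff
    min_tokens: int = 1,      # assumed lower bound: always make progress
    max_tokens: int = 4,      # assumed upper bound: limit error accumulation
) -> List[int]:
    """Pick which masked positions to commit in one denoising step."""
    if min_tokens < 1 or max_tokens < min_tokens:
        raise ValueError("need 1 <= min_tokens <= max_tokens")
    # Still-masked positions, ranked from most to least confident.
    candidates = sorted(
        (i for i, is_masked in enumerate(masked) if is_masked),
        key=lambda i: confidences[i],
        reverse=True,
    )
    # Adaptive part: count how many candidates clear the threshold.
    n_confident = sum(1 for i in candidates if confidences[i] >= threshold)
    # Bounded part: clamp that count to [min_tokens, max_tokens],
    # and never exceed the number of positions still masked.
    k = min(max(min_tokens, min(n_confident, max_tokens)), len(candidates))
    return sorted(candidates[:k])


def decode_block(
    score_fn: Callable[[Sequence[bool]], Sequence[float]],
    block_size: int,
    **bounds,
) -> int:
    """Unmask one block and return the number of denoising steps used.

    `score_fn` stands in for a model forward pass: given the current mask,
    it returns one confidence per position in the block.
    """
    masked = [True] * block_size
    steps = 0
    while any(masked):
        confidences = score_fn(masked)
        for i in select_positions(confidences, masked, **bounds):
            masked[i] = False
        steps += 1
    return steps


if __name__ == "__main__":
    easy = decode_block(lambda m: [0.95] * 6, block_size=6, max_tokens=4)
    hard = decode_block(lambda m: [0.10] * 6, block_size=6, max_tokens=4)
    print("easy block steps:", easy)
    print("hard block steps:", hard)
```

Under these assumptions, a block where the model is confident everywhere finishes in few steps, while a low-confidence block falls back to committing `min_tokens` per step, which is the speed-versus-error trade-off the abstract describes.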

Yi Lu, Deyang Kong, Jianing Wang, Linsen Guo, Xue Wang, Qi Guo, Tao Gui, Xuanjing Huang, Wei Ye, Shikun Zhang, Wei Wang • 2026

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | MATH500 | Accuracy (ACC): 84 | 133
Code Reasoning | LiveCodeBench | Accuracy: 42.6 | 46
Mathematical Reasoning | AMC23 | AVG@8: 80 | 25
Mathematical Reasoning | AIME24 | TPF: 5.07 | 14
Mathematical Reasoning | AIME25 | TPF: 473 | 14
Reasoning Performance (Aggregate) | AVG | TPF: 337 | 14
General Reasoning | GPQA | TPF: 149 | 14