SpecTr-GBV: Multi-Draft Block Verification Accelerating Speculative Decoding

About

Autoregressive language models suffer from high inference latency due to their sequential decoding nature. Speculative decoding (SD) mitigates this by employing a lightweight draft model to propose candidate tokens, which are selectively verified by a larger target model. Existing methods adopt either multi-draft strategies to increase acceptance rates or block verification techniques to jointly verify multiple tokens, but they remain limited because they treat these improvements in isolation. In this work, we propose SpecTr-GBV, a novel SD method that unifies multi-draft and greedy block verification (GBV) into a single framework. By formulating the verification step as an optimal transport problem over draft and target token blocks, SpecTr-GBV improves both theoretical efficiency and empirical performance. We theoretically prove that SpecTr-GBV achieves the optimal expected acceptance length attainable within the framework of i.i.d. draft generation, and that this bound improves as the number of drafts increases. Empirically, we evaluate SpecTr-GBV on five datasets against four baselines. Our method achieves superior speedup and significantly higher block efficiency while preserving output quality. In addition, we perform comprehensive ablation studies to evaluate the impact of various hyperparameters.
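
For readers new to the draft-then-verify loop described above, the following is a minimal sketch of the standard single-draft, token-by-token verification rule used in speculative sampling. It is not SpecTr-GBV itself, which verifies blocks across multiple drafts. The function name `verify_draft` and its argument layout are hypothetical, chosen only for illustration, and distributions are plain Python lists over a toy vocabulary.

```python
import random


def verify_draft(draft_tokens, q_probs, p_probs, rng):
    """Token-level verification of one draft (illustrative sketch only).

    draft_tokens: token ids proposed by the draft model.
    q_probs[i]:   draft distribution over the vocabulary at position i.
    p_probs[i]:   target distribution at position i; it holds one extra
                  entry, used for a bonus token if every draft is accepted.
    Returns the tokens emitted in this decoding step.
    """
    out = []
    for i, tok in enumerate(draft_tokens):
        p, q = p_probs[i], q_probs[i]
        # Accept the draft token with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
            continue
        # On rejection, resample from the residual max(p - q, 0) and stop.
        residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
        out.append(rng.choices(range(len(p)), weights=residual)[0])
        return out
    # Every draft token was accepted: sample one bonus token from the target.
    bonus = p_probs[len(draft_tokens)]
    out.append(rng.choices(range(len(bonus)), weights=bonus)[0])
    return out


# Toy usage: draft and target agree, so both draft tokens are accepted
# and one bonus token is appended.
rng = random.Random(0)
q = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0]]
p = q + [[0.0, 0.0, 1.0]]
tokens = verify_draft([0, 1], q, p, rng)
```

Each call emits between one and `len(draft_tokens) + 1` tokens for a single target-model pass, which is where the latency saving comes from. Multi-draft and block verification methods aim to raise the expected number of tokens accepted per pass.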

Yijun Lin, Jinhao Sheng, Qingyue Cai, Feng Zhou • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multilingual Mathematical Reasoning | MGSM (test) | -- | 109 |
| Instruction Following | Alpaca (test) | SR Score 1.6 | 21 |
| Language Modeling | LM1B (test) | Block Efficiency 8.94 | 15 |
| Python Programming | HumanEval (test) | BE Score 10.45 | 10 |
| Speculative Decoding | HumanEval (test) | BE 8.02 | 10 |
| Speculative Decoding | GSM8K (test) | BE Score 7.95 | 10 |
| Speculative Decoding | MGSM (test) | BE 7.39 | 10 |
| Speculative Decoding | LM1B (test) | BE 7.88 | 10 |
| Speculative Decoding | Alpaca (test) | BE Score 7.7 | 10 |
| grade-school math | GSM8K (test) | BE Score 6.68 | 10 |