
Beyond the Target: From Imitation to Collaboration in Speculative Decoding

About

Speculative decoding (SPD) accelerates large language model (LLM) inference by letting a smaller draft model propose multiple future tokens that are verified in parallel by a larger target model. The dominant SPD paradigm treats the target model as the sole reliable teacher, accepting a draft token only when it exactly matches the target prediction. This design implicitly assumes that the target is always the better choice at every position. In practice, this assumption does not hold. Although the draft is the weaker model overall, it is not uniformly inferior at the token level. In a meaningful fraction of cases where draft and target disagree, the draft's choice is the one that leads to the correct final answer. Inspired by this, we introduce Collaborative Speculative Decoding (CoSpec), a generalization of SPD that no longer treats the target model as the sole token-level authority. CoSpec trains an arbitration policy via reinforcement learning to decide whether to accept tokens from the draft or target model, selectively accepting draft tokens at mismatches when doing so is likely to yield a correct final answer. Experimental results show that CoSpec maintains substantial speedups while surpassing target-only performance. By shifting the emphasis from imitation to collaboration, CoSpec suggests a new perspective on speculative decoding.
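The verification step described above can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `arbiter` callback signature, and the choice to keep scanning after an accepted draft token are all assumptions made for illustration. The sketch uses greedy (exact-match) verification over integer token ids and omits the extra "bonus" token that standard SPD emits when a whole block is accepted. In CoSpec the arbiter would be a policy trained with reinforcement learning; here it is just a plain function.

```python
from typing import Callable, List, Sequence

# Hypothetical signature: (committed_prefix, draft_token, target_token) -> True to keep the draft token.
Arbiter = Callable[[Sequence[int], int, int], bool]


def target_always_wins(prefix: Sequence[int], draft_token: int, target_token: int) -> bool:
    """Standard exact-match SPD: a mismatching draft token is never kept."""
    return False


def verify_block(
    context: Sequence[int],
    draft_tokens: Sequence[int],
    target_tokens: Sequence[int],
    arbiter: Arbiter = target_always_wins,
) -> List[int]:
    """Return the tokens committed from one drafted block.

    draft_tokens[i] is the draft's proposal at position i. target_tokens[i] is
    the target's greedy prediction at position i, computed in one parallel pass
    over the context followed by draft_tokens[:i].
    """
    committed: List[int] = []
    for draft_token, target_token in zip(draft_tokens, target_tokens):
        if draft_token == target_token:
            # Agreement: the draft token is accepted at no extra cost.
            committed.append(draft_token)
            continue
        prefix = list(context) + committed
        if arbiter(prefix, draft_token, target_token):
            # Assumed collaborative case: keep the draft token. The committed
            # prefix still equals the drafted prefix, so the remaining target
            # predictions stay valid and scanning continues.
            committed.append(draft_token)
            continue
        # Imitation case: take the target token and discard the rest of the
        # block, because later positions were conditioned on the rejected token.
        committed.append(target_token)
        break
    return committed
```

With the default arbiter, a mismatch at the second position truncates the block, so `verify_block([1], [5, 6, 7], [5, 9, 7])` commits `[5, 9]`. An arbiter that always sides with the draft would instead commit the whole drafted block `[5, 6, 7]`.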

Jinze Li, Yixing Xu, Guanchen Li, Jinfeng Xu, Shuo Yang, Yang Zhang, Xuanwu Yin, Dong Li, Edith C.H. Ngai, Emad Barsoum • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Inference Speed (× Baseline) | 4.15 | 26 |
| Code Generation | MBPP | Speedup | 3.97 | 26 |
| General Reasoning and Coding | Mean of GSM8K, HumanEval, MBPP | Speed | 4 | 26 |
| Mathematical Reasoning | GSM8K | Speed (×) | 3.88 | 26 |
| Dialog | MT-Bench | Score | 9.43 | 2 |
| Factual QA | TriviaQA | Score | 89.2 | 2 |
| Hard Reasoning | GPQA | Score | 55.4 | 2 |
| Mathematical Reasoning | MATH 500 | Score | 66.2 | 2 |