SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning

About

Large reasoning models have recently demonstrated exceptional performance on a variety of tasks. However, reasoning models always consume excessive tokens even for simple queries, which wastes resources and prolongs user latency. To address this challenge, we propose SelfBudgeter, a self-adaptive reasoning strategy for efficient and controllable reasoning. Specifically, we first train the model to self-estimate the required reasoning budget based on the query. We then introduce budget-guided GRPO for reinforcement learning, which effectively maintains accuracy while reducing output length. Experimental results demonstrate that SelfBudgeter dynamically allocates budgets according to problem complexity, achieving an average response length compression of 61% on math reasoning tasks while maintaining accuracy. Furthermore, SelfBudgeter lets users see how long generation will take and decide whether to continue or stop. Users can also directly control the reasoning length by setting token budgets upfront.
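The two quantities the abstract mentions, response length compression and adherence to a self-estimated budget, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the abstract does not give the reward used in budget-guided GRPO, so `toy_budget_reward`, its `tolerance` parameter, and the sample lengths below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def length_compression(baseline_lengths, budgeted_lengths):
    """Fraction by which the average response length shrinks.

    A value of 0.61 means responses are 61% shorter on average.
    """
    return 1.0 - mean(budgeted_lengths) / mean(baseline_lengths)


def toy_budget_reward(is_correct, predicted_budget, actual_length, tolerance=0.2):
    """Hypothetical reward, NOT the paper's formulation.

    Gives 1.0 for a correct answer and subtracts a penalty once the response
    length deviates from the self-estimated budget by more than `tolerance`
    (expressed as a fraction of the budget).
    """
    correctness = 1.0 if is_correct else 0.0
    relative_gap = abs(actual_length - predicted_budget) / predicted_budget
    penalty = max(0.0, relative_gap - tolerance)
    return correctness - penalty


# Made-up token counts, chosen only to show the arithmetic.
baseline = [1000, 3000]   # lengths without budgeting
budgeted = [390, 1170]    # lengths with budgeting
compression = length_compression(baseline, budgeted)

on_budget = toy_budget_reward(True, predicted_budget=500, actual_length=550)
over_budget = toy_budget_reward(True, predicted_budget=500, actual_length=1000)
```

In this toy setup a response that stays within 20% of its estimated budget keeps the full correctness reward, while one that doubles the budget is penalized. Any real training objective would need the definitions from the paper itself.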

Zheng Li, Qingxiu Dong, Jingyuan Ma, Di Zhang, Kai Jia, Zhifang Sui • 2025

Related benchmarks

Task	Dataset	Result
Mathematical Reasoning	AIME 2024	Accuracy: 25	525
Mathematical Reasoning	GSM8K	--	499
Mathematical Reasoning	Minerva Math	Accuracy: 53.4	251
Mathematical Problem Solving	MATH500	Accuracy: 86.87	96
Mathematical Reasoning	Minerva	Pass@1: 25.5	22
Mathematical Reasoning	AIME24	Pass@1: 13.42	18
Mathematical Reasoning	MATH500	Pass@1 Rate: 53.47	18
Math problem solving	AIME 2025 (test)	Accuracy: 30	9
Math problem solving	GSM8K (test)	Accuracy: 90.3	9
Complex Reasoning	SCoRE (test)	Accuracy: 16.26	5

