Thinkless: LLM Learns When to Think

About

Reasoning language models, capable of extended chain-of-thought reasoning, have demonstrated remarkable performance on tasks requiring complex logical inference. However, applying elaborate reasoning to every query often wastes substantial computation, particularly when many problems admit straightforward solutions. This motivates an open question: can LLMs learn when to think? To answer it, we propose Thinkless, a learnable framework that empowers an LLM to adaptively select between short-form and long-form reasoning based on both task complexity and the model's own ability. Thinkless is trained under a reinforcement learning paradigm and employs two control tokens: <short> for concise responses and <think> for detailed reasoning. At the core of our method is the Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which decomposes the learning objective of hybrid reasoning into two components: (1) a control token loss that governs the selection of the reasoning mode, and (2) a response loss that improves the accuracy of the generated answers. This decoupled formulation enables fine-grained control over the contribution of each objective, stabilizing training and effectively preventing the collapse observed in vanilla GRPO. Empirically, on benchmarks such as Minerva Algebra, MATH-500, and GSM8K, Thinkless reduces the use of long-chain thinking by 50% to 90%, significantly improving the efficiency of reasoning language models. The code is available at https://github.com/VainF/Thinkless
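
The two-part objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simplified, unclipped policy-gradient form with no KL term, in which each sampled response receives a group-relative advantage, the first token of each sample is the control token (<short> or <think>), and the response term is averaged over the remaining tokens. The function names and the weighting coefficient `alpha` are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group (GRPO-style baseline)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def decoupled_loss(logprobs, rewards, alpha=1.0):
    """Return (total, control_loss, response_loss) for one group of samples.

    logprobs[i][0]  : log-prob of the control token of sample i (assumed layout)
    logprobs[i][1:] : log-probs of the response tokens (at least one required)
    alpha           : hypothetical weight on the control-token term
    """
    advantages = group_relative_advantages(rewards)
    control_terms, response_terms = [], []
    for token_logprobs, adv in zip(logprobs, advantages):
        control_lp = token_logprobs[0]
        response_lps = token_logprobs[1:]
        # Mode selection: a single token, so no length normalization.
        control_terms.append(-adv * control_lp)
        # Answer quality: averaged over the response so that long
        # responses do not dominate the term.
        response_terms.append(-adv * mean(response_lps))
    control_loss = mean(control_terms)
    response_loss = mean(response_terms)
    return alpha * control_loss + response_loss, control_loss, response_loss


if __name__ == "__main__":
    # Four samples for one query; reward 1.0 means a correct answer.
    group_logprobs = [
        [-0.1, -0.2, -0.4],
        [-2.0, -1.0, -1.0],
        [-0.1, -0.3, -0.3],
        [-2.0, -1.5, -0.5],
    ]
    group_rewards = [1.0, 0.0, 1.0, 0.0]
    print(decoupled_loss(group_logprobs, group_rewards, alpha=0.5))
```

Keeping the two terms separate means `alpha` can be varied to change how strongly mode selection contributes relative to answer quality, which is the kind of control the decoupled formulation is meant to provide.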

Gongfan Fang, Xinyin Ma, Xinchao Wang • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Mathematical Reasoning	GSM8K	--	--	499
Mathematical Reasoning	MATH-500	Accuracy	73.5	104
Mathematical Reasoning	MATH-500	Average Tokens	2.56e+3	104
General Reasoning	GPQA-Diamond & MMLU-Pro	Accuracy	31.95	35
General Reasoning	GPQA Diamond	Accuracy	32.82	31
Mathematical Reasoning	GSM8K	Tokens	356	30
Mathematical Reasoning	Minerva	Pass@1	25.8	22
Mathematical Reasoning	AIME24	Pass@1	16.08	18
Mathematical Reasoning	MATH-500	Pass@1 Rate	54.14	18
Mathematical Reasoning	MATH-500	Tokens Used	888	17

Showing 10 of 20 rows
