CoT-Valve: Length-Compressible Chain-of-Thought Tuning

About

Chain-of-Thought significantly enhances a model's reasoning capability, but it also considerably increases inference cost because the reasoning chains are long. Observing that the reasoning path can be compressed easily on easy tasks but is difficult to compress on hard tasks, we explore the feasibility of elastically controlling the length of reasoning paths with a single model, thereby reducing the inference overhead of reasoning models dynamically based on task difficulty. We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths. To achieve this, we propose to identify a direction in the parameter space that, when manipulated, can effectively control the length of the generated CoT. Moreover, we show that this property is valuable for compressing the reasoning chain. We construct datasets with chains from long to short for the same questions and explore two enhanced strategies for CoT-Valve: (1) a precise length-compressible CoT tuning method, and (2) a progressive chain length compression approach. Our experiments show that CoT-Valve successfully enables controllability and compressibility of the chain and outperforms prompt-based control. We applied this method to QwQ-32B-Preview, reducing reasoning chains on GSM8K from 741 to 225 tokens with a minor performance drop (95.07% to 94.92%) and on AIME from 6827 to 4629 tokens, with only one additional incorrect answer.
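The idea of moving along a single direction in parameter space can be illustrated with a minimal sketch. This is not the authors' implementation: the toy parameter dictionary, the names `apply_valve` and `token_reduction`, and the scalar `alpha` are all assumptions made for illustration, and the abstract does not state which values of `alpha` correspond to longer or shorter chains. The second helper only reproduces the arithmetic behind the token counts reported above.

```python
from typing import Dict, List

# Hypothetical stand-in for model weights: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def apply_valve(base: Params, direction: Params, alpha: float) -> Params:
    """Return base + alpha * direction, parameter by parameter.

    `direction` is an assumed learned update in parameter space; `alpha`
    scales how far the model is moved along it.
    """
    return {
        name: [w + alpha * d for w, d in zip(weights, direction[name])]
        for name, weights in base.items()
    }


def token_reduction(before: float, after: float) -> float:
    """Fraction of tokens removed when a chain shrinks from `before` to `after`."""
    return (before - after) / before


# Toy example with a single made-up parameter tensor.
base = {"layer.weight": [1.0, 2.0]}
direction = {"layer.weight": [0.5, -1.0]}

for alpha in (0.0, 0.5, 1.0):
    print(alpha, apply_valve(base, direction, alpha))

# Token counts taken from the abstract (QwQ-32B-Preview).
print(f"GSM8K reduction: {token_reduction(741, 225):.1%}")   # 69.6%
print(f"AIME reduction:  {token_reduction(6827, 4629):.1%}")  # 32.2%
```

With `alpha = 0.0` the base parameters are returned unchanged, so a single set of weights plus one direction yields a family of models indexed by `alpha`.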

Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang • 2025

Related benchmarks

Task	Dataset	Metric	Result
Mathematical Reasoning	GSM8K	Accuracy	90.5	499
Mathematical Reasoning	MATH 500	Average Tokens	2.38e+3	104
Mathematical Reasoning	Math Benchmarks Aggregate	Accuracy (Avg)	63.83	62
Mathematical Reasoning	GSM8K (test)	Accuracy	96.1	33
Mathematical Reasoning	AIME 2024	ACC	-1.50e+3	26
Medical Question Answering	Medical Benchmarks (MedQA, MedMCQA, BULLET) (test)	MedQA Accuracy	0.55	18
Mathematical Reasoning	MATH	Accuracy	80.33	18
Mathematical Reasoning	AMC23	Accuracy	65	18
Mathematical Reasoning	AIME 24	Accuracy	20	18
Mathematical Reasoning	MATH500	Accuracy	-10.6	14

Showing 10 of 24 rows
