DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models

About

Recent advances in slow-thinking reasoning models have shown exceptional performance on complex reasoning tasks. However, these models often overthink, generating redundant reasoning steps for simple problems, which leads to excessive use of computational resources. Current mitigation strategies reduce reasoning tokens uniformly, so they risk degrading performance on challenging tasks that require extended reasoning. This paper introduces Difficulty-Adaptive Slow Thinking (DAST), a novel framework that enables models to autonomously adjust the length of their Chain-of-Thought (CoT) based on problem difficulty. We first propose a Token Length Budget (TLB) metric to quantify difficulty, then leverage budget-aware reward shaping and budget preference optimization to implement DAST. DAST penalizes overlong responses for simple tasks while incentivizing sufficient reasoning for complex problems. Experiments on diverse datasets and model scales demonstrate that DAST effectively mitigates overthinking (reducing token usage by over 30% on average) while preserving reasoning accuracy on complex problems. Our code and models are available at https://github.com/AnonymousUser0520/AnonymousRepo01.
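
The abstract names a Token Length Budget and budget-aware reward shaping but does not give their formulas. The sketch below is one hypothetical way such a scheme could look, not the paper's definition: the function names, the interpolation used for the budget, the clipping range, and the `weight` parameter are all assumptions made for illustration. The budget grows toward the maximum length as sampling accuracy drops (harder problems), correct answers are penalized only when they exceed the budget, and incorrect answers are penalized more when they stop well short of it.

```python
def token_length_budget(correct_lengths, num_samples, max_length):
    """Hypothetical budget (an assumption, not the paper's formula).

    Interpolates between the mean length of correct samples and the
    maximum length, weighted by sampling accuracy: easy problems get a
    budget near the typical correct length, hard ones near max_length.
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if not correct_lengths:
        # No correct sample: treat the problem as maximally hard.
        return float(max_length)
    accuracy = len(correct_lengths) / num_samples
    mean_correct = sum(correct_lengths) / len(correct_lengths)
    return accuracy * mean_correct + (1.0 - accuracy) * max_length


def shaped_reward(is_correct, length, budget, weight=0.5):
    """Illustrative budget-aware reward (an assumption, not the paper's).

    Correct responses start at +1 and lose reward for exceeding the budget.
    Incorrect responses start at -1 and lose more for falling short of it.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Relative deviation from the budget, clipped to [-1, 1].
    deviation = max(-1.0, min(1.0, (length - budget) / budget))
    if is_correct:
        return 1.0 - weight * max(0.0, deviation)
    return -1.0 - weight * max(0.0, -deviation)


if __name__ == "__main__":
    budget = token_length_budget([100, 300], num_samples=4, max_length=1000)
    print(budget)
    print(shaped_reward(True, 2000, budget))
    print(shaped_reward(False, 100, budget))
```

Under these assumptions, responses could then be ranked by shaped reward to build preference pairs for the optimization step the abstract mentions; how the paper actually constructs those pairs is not stated here.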

Yi Shen, Jian Zhang, Jieyun Huang, Shuming Shi, Wenjing Zhang, Jiangze Yan, Ning Wang, Kai Wang, Zhaoxiang Liu, Shiguo Lian • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Commonsense Reasoning | WinoGrande | -- | 1085 |
| Mathematical Reasoning | AIME 24 | Accuracy 50.83 | 154 |
| Mathematical Reasoning | Minerva | -- | 138 |
| Mathematical Reasoning | AIME24 | Pass@1 Accuracy 77.8 | 82 |
| Mathematical Reasoning | OlympiadBench | Accuracy 55.34 | 81 |
| Mathematical Reasoning | MATH 500 | -- | 76 |
| Mathematical Reasoning | AIME 24 | Pass@1 77.8 | 54 |
| Mathematical Reasoning | AIME 25 | Pass@1 Accuracy 68.9 | 54 |
| Scientific Reasoning | GPQA Diamond | Pass@1 Accuracy 65.7 | 54 |
| Mathematical Reasoning | GSM8K | Pass@1 Accuracy 94.2 | 54 |
Showing 10 of 27 rows
