
Dynamic Early Exit in Reasoning Models

About

Recent advances in large reasoning language models (LRLMs) rely on test-time scaling, which extends long chain-of-thought (CoT) generation to solve complex tasks. However, overthinking in long CoT not only reduces problem-solving efficiency but also risks accuracy loss due to extremely detailed or redundant reasoning steps. We propose a simple yet effective method that allows LLMs to self-truncate CoT sequences by exiting early during generation. Instead of relying on fixed heuristics, the proposed method monitors model behavior at potential reasoning transition points and dynamically terminates the next reasoning chain's generation when the model exhibits high confidence in a trial answer. Our method requires no additional training and can be seamlessly integrated into existing o1-like reasoning LLMs. Experiments on 10 reasoning benchmarks (e.g., GSM8K, MATH-500, AMC, GPQA, AIME and LiveCodeBench) show that the proposed method is consistently effective on 11 cutting-edge reasoning LLMs of varying series and sizes, reducing the length of CoT sequences by an average of 19.1% to 80.1% while improving accuracy by 0.3% to 5.0%.
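The core loop described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the paper's actual implementation): it assumes generation arrives in reasoning chunks, that transition points can be spotted by typical pivot phrases such as "Wait" or "Alternatively", and that a caller-supplied `confidence_fn` scores the model's confidence in a trial answer. All names and the threshold value are illustrative assumptions.

```python
# Hypothetical sketch of confidence-based dynamic early exit for CoT generation.
# Pivot phrases and the 0.9 threshold are illustrative, not from the paper.
TRANSITION_MARKERS = ("Wait", "Alternatively", "Hmm")

def dynamic_early_exit(chunks, confidence_fn, threshold=0.9):
    """Accumulate reasoning chunks; stop once the model looks confident.

    chunks        -- iterable of generated reasoning segments (strings)
    confidence_fn -- callable mapping the text so far to a score in [0, 1]
    threshold     -- exit when confidence at a transition point reaches this
    """
    text = []
    for chunk in chunks:
        text.append(chunk)
        # Only probe confidence at potential reasoning transition points,
        # i.e., chunks that open with a pivot phrase.
        if chunk.strip().startswith(TRANSITION_MARKERS):
            if confidence_fn(" ".join(text)) >= threshold:
                break  # truncate the next reasoning chain early
    return " ".join(text)
```

A toy run: with chunks `["Step 1: 2+2=4.", "Wait, 4 is correct.", "Alternatively, ..."]` and a confidence function that returns 0.95 once the trial answer appears, generation stops after the second chunk instead of continuing into the redundant alternative chain.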

Chenxu Yang, Qingyi Si, Yongjie Duan, Zheliang Zhu, Chenyu Zhu, Qiaowei Li, Minghui Chen, Zheng Lin, Weiping Wang • 2025

Related benchmarks

Task                       | Dataset                | Metric   | Result | Rank
Mathematical Reasoning     | MATH500 (test)         | Accuracy | 90.4   | 381
Mathematical Reasoning     | GSM8K                  | Accuracy | 94.39  | 351
Mathematical Reasoning     | AIME 24                | Accuracy | 56.7   | 113
Mathematical Reasoning     | MATH500                | Accuracy | 89     | 57
Mathematical Reasoning     | MATH 500               | Accuracy | 95.4   | 40
Mathematical Reasoning     | AMC 2023               | Accuracy | 90     | 32
Mathematical Reasoning     | AIME 2024              | Accuracy | 66.67  | 32
Commonsense Reasoning      | CommonsenseQA Non-Math | Accuracy | 84.93  | 32
Mathematical Reasoning     | AMC 23                 | Accuracy | 90.8   | 28
Science Question Answering | GPQA D                 | Accuracy | 59.3   | 28

(Showing 10 of 34 rows)
