
Dynamic Early Exit in Reasoning Models

About

Recent advances in large reasoning language models (LRLMs) rely on test-time scaling, which extends long chain-of-thought (CoT) generation to solve complex tasks. However, overthinking in long CoT not only reduces problem-solving efficiency but also risks accuracy loss due to extremely detailed or redundant reasoning steps. We propose a simple yet effective method that allows LLMs to self-truncate CoT sequences by exiting early during generation. Instead of relying on fixed heuristics, the proposed method monitors model behavior at potential reasoning transition points and dynamically terminates the next reasoning chain's generation when the model exhibits high confidence in a trial answer. Our method requires no additional training and can be seamlessly integrated into existing o1-like reasoning LLMs. Experiments on 10 reasoning benchmarks (e.g., GSM8K, MATH-500, AMC, GPQA, AIME and LiveCodeBench) show that the proposed method is consistently effective on 11 cutting-edge reasoning LLMs of varying series and sizes, reducing the length of CoT sequences by an average of 19.1% to 80.1% while improving accuracy by 0.3% to 5.0%.
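The core loop described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the paper's actual implementation): it assumes generation arrives in reasoning chunks, that transition points can be spotted by typical pivot phrases such as "Wait" or "Alternatively", and that a caller-supplied `confidence_fn` scores the model's confidence in a trial answer. All names and the threshold value are illustrative assumptions.

```python
# Hypothetical sketch of confidence-based dynamic early exit for CoT generation.
# Pivot phrases and the 0.9 threshold are illustrative, not from the paper.
TRANSITION_MARKERS = ("Wait", "Alternatively", "Hmm")

def dynamic_early_exit(chunks, confidence_fn, threshold=0.9):
    """Accumulate reasoning chunks; stop once the model looks confident.

    chunks        -- iterable of generated reasoning segments (strings)
    confidence_fn -- callable mapping the text so far to a score in [0, 1]
    threshold     -- exit when confidence at a transition point reaches this
    """
    text = []
    for chunk in chunks:
        text.append(chunk)
        # Only probe confidence at potential reasoning transition points,
        # i.e., chunks that open with a pivot phrase.
        if chunk.strip().startswith(TRANSITION_MARKERS):
            if confidence_fn(" ".join(text)) >= threshold:
                break  # truncate the next reasoning chain early
    return " ".join(text)
```

A toy run: with chunks `["Step 1: 2+2=4.", "Wait, 4 is correct.", "Alternatively, ..."]` and a confidence function that returns 0.95 once the trial answer appears, generation stops after the second chunk instead of continuing into the redundant alternative chain.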

Chenxu Yang, Qingyi Si, Yongjie Duan, Zheliang Zhu, Chenyu Zhu, Qiaowei Li, Minghui Chen, Zheng Lin, Weiping Wang • 2025

Related benchmarks

Task                       | Dataset                | Metric   | Result | Rank
Mathematical Reasoning     | MATH500 (test)         | Accuracy | 90.4   | 381
Mathematical Reasoning     | GSM8K                  | Accuracy | 94.39  | 351
Mathematical Reasoning     | AIME 24                | Accuracy | 56.7   | 113
Mathematical Reasoning     | MATH500                | Accuracy | 89     | 57
Mathematical Reasoning     | MATH 500               | Accuracy | 95.4   | 40
Mathematical Reasoning     | AMC 2023               | Accuracy | 90     | 32
Mathematical Reasoning     | AIME 2024              | Accuracy | 66.67  | 32
Commonsense Reasoning      | CommonsenseQA Non-Math | Accuracy | 84.93  | 32
Mathematical Reasoning     | AMC 23                 | Accuracy | 90.8   | 28
Science Question Answering | GPQA D                 | Accuracy | 59.3   | 28

(Showing 10 of 34 rows)
