
Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers

About

Hybrid reasoning language models are commonly controlled through high-level Think/No-think instructions to regulate reasoning behavior, yet we find that such mode switching is largely driven by a small set of trigger tokens rather than the instructions themselves. Through attention analysis and controlled prompting experiments, we show that a leading "Okay" token induces reasoning behavior, while the newline pattern following "</think>" suppresses it. Based on this observation, we propose Mid-Think, a simple training-free prompting format that combines these triggers to achieve intermediate-budget reasoning, consistently outperforming fixed-token and prompt-based baselines on the accuracy-length trade-off. Furthermore, applying Mid-Think to RL training after SFT reduces training time by approximately 15% while improving the final performance of Qwen3-8B on AIME from 69.8% to 72.4% and on GPQA from 58.5% to 61.1%, demonstrating its effectiveness for both inference-time control and RL-based reasoning training.
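The trigger mechanism described above can be illustrated with a minimal sketch. Note that the exact prompt template, the trigger strings, and the `build_prefill` helper below are assumptions inferred from the abstract, not the authors' released format:

```python
# Hypothetical sketch of Mid-Think-style assistant-turn prefills.
# The trigger strings below are inferred from the abstract, which reports
# that a leading "Okay" token induces reasoning while the newline pattern
# after "</think>" suppresses it; the exact strings are an assumption.

THINK_TRIGGER = "Okay"            # assumed reasoning-inducing leading token
SUPPRESS_TRIGGER = "</think>\n\n"  # assumed reasoning-suppressing pattern


def build_prefill(mode: str) -> str:
    """Return an assistant-turn prefill for a hybrid reasoning model.

    mode: "think"     -> reasoning induced by the leading "Okay" token
          "no_think"  -> reasoning suppressed by the "</think>" pattern
          "mid_think" -> both triggers combined, aiming at a shorter
                         but nonzero (intermediate-budget) reasoning trace
    """
    if mode == "think":
        return THINK_TRIGGER
    if mode == "no_think":
        return SUPPRESS_TRIGGER
    if mode == "mid_think":
        # Combine the suppressing pattern with the inducing token.
        return SUPPRESS_TRIGGER + THINK_TRIGGER
    raise ValueError(f"unknown mode: {mode}")
```

In use, the chosen prefill would be appended to the assistant turn before generation, so the model continues from the trigger rather than choosing its own opening tokens.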

Wang Yang, Debargha Ganguly, Xinpeng Li, Chaoda Song, Shouren Wang, Vikash Singh, Vipin Chaudhary, Xiaotian Han • 2026

Related benchmarks

Task                    Dataset                Metric    Result  Rank
Mathematical Reasoning  MATH500 (test)         Accuracy  94.1    381
Mathematical Reasoning  AIME No-think (test)   Accuracy  62.9    8
Mathematical Reasoning  AIME Think (test)      Accuracy  72.4    8
Question Answering      GPQA No-think (test)   Accuracy  47.8    8
Question Answering      GPQA Think (test)      Accuracy  61.1    8
