
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

About

This paper presents AlphaOne ($\alpha$1), a universal framework for modulating reasoning progress in large reasoning models (LRMs) at test time. $\alpha$1 first introduces $\alpha$ moment, which represents the scaled thinking phase with a universal parameter $\alpha$. Within this scaled pre-$\alpha$ moment phase, it dynamically schedules slow thinking transitions by modeling the insertion of reasoning transition tokens as a Bernoulli stochastic process. After the $\alpha$ moment, $\alpha$1 deterministically terminates slow thinking with the end-of-thinking token, thereby fostering fast reasoning and efficient answer generation. This approach unifies and generalizes existing monotonic scaling methods by enabling flexible and dense slow-to-fast reasoning modulation. Extensive empirical studies on various challenging benchmarks across mathematical, coding, and scientific domains demonstrate $\alpha$1's superior reasoning capability and efficiency. Project page: https://alphaone-project.github.io/
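The two-phase schedule described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the token strings, the linear decay of the insertion probability, the definition of the $\alpha$ moment as $\alpha$ times an assumed thinking budget, and the names `insertion_prob` and `schedule` are all hypothetical, and every decoding step is treated as a candidate insertion point for simplicity.

```python
import random

SLOW_TOKEN = "wait"      # assumed stand-in for a reasoning transition token
END_TOKEN = "</think>"   # assumed stand-in for the end-of-thinking token


def insertion_prob(t, alpha_moment, p_start=1.0):
    """Hypothetical schedule: probability decays linearly to 0 at the alpha moment."""
    return max(0.0, p_start * (1.0 - t / alpha_moment))


def schedule(num_steps, think_budget, alpha, seed=0):
    """Simulate slow-then-fast modulation over `num_steps` decoding steps.

    Before the alpha moment, a transition token is inserted at step t
    with probability insertion_prob(t) (a Bernoulli draw per step).
    At the alpha moment, the end-of-thinking token is emitted once
    and the slow-thinking phase stops.
    """
    rng = random.Random(seed)
    # Assumption: the alpha moment is alpha times a given thinking budget.
    alpha_moment = round(alpha * think_budget)
    events = []
    for t in range(num_steps):
        if t < alpha_moment:
            # Pre-alpha-moment phase: stochastic slow-thinking transitions.
            if rng.random() < insertion_prob(t, alpha_moment):
                events.append((t, SLOW_TOKEN))
        else:
            # Post-alpha-moment phase: deterministic termination.
            events.append((t, END_TOKEN))
            break
    return alpha_moment, events


if __name__ == "__main__":
    moment, events = schedule(num_steps=300, think_budget=100, alpha=1.5)
    print("alpha moment:", moment)
    print("transition tokens inserted:", sum(tok == SLOW_TOKEN for _, tok in events))
    print("final event:", events[-1])
```

In this sketch a larger `alpha` pushes the $\alpha$ moment later and leaves more steps in which slow-thinking transitions can be sampled, while the decaying probability makes those transitions sparser as the moment approaches.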

Junyu Zhang, Runpei Dong, Han Wang, Xuying Ning, Haoran Geng, Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, Saurabh Gupta, Huan Zhang • 2025

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | AIME24 | Pass@1 Accuracy: 78.9 | 82
Scientific Reasoning | GPQA Diamond | Pass@1 Accuracy: 66.8 | 54
Mathematical Reasoning | AIME 24 | Pass@1: 78.9 | 54
Mathematical Reasoning | AIME 25 | Pass@1 Accuracy: 71.1 | 54
Mathematical Reasoning | GSM8K | Pass@1 Accuracy: 94.5 | 54
