
MUR: Momentum Uncertainty guided Reasoning for Large Language Models

About

Large Language Models have achieved impressive performance on reasoning-intensive tasks, yet optimizing their reasoning efficiency remains an open challenge. While Test-Time Scaling (TTS) improves reasoning quality, it often leads to overthinking, wasting tokens on redundant computations. This work investigates how to efficiently and adaptively guide the test-time scaling of current models without additional training. Inspired by the concept of momentum in physics, we propose Momentum Uncertainty-guided Reasoning (MUR), which dynamically allocates thinking budgets to critical reasoning steps by tracking and aggregating stepwise uncertainty over time. To support flexible inference-time control, we introduce gamma-control, a simple mechanism that tunes the reasoning budget via a single hyperparameter. We provide in-depth theoretical proofs to support the superiority of MUR in terms of stability and bias. MUR is comprehensively evaluated against various TTS methods across four challenging benchmarks (MATH-500, AIME24, AIME25, and GPQA-diamond) using different sizes of recent Qwen3 models (1.7B, 4B, and 8B). Results demonstrate that MUR reduces computation by over 45% on average while improving accuracy by 0.33% to 3.46%.
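To make the idea of tracking and aggregating stepwise uncertainty more concrete, here is a minimal sketch in Python. It is not taken from the paper: the abstract does not give the update rule or the trigger condition, so the exponential moving average, the threshold test, and the names `step_uncertainty`, `steps_to_scale`, `alpha`, and `gamma` are all assumptions made for illustration.

```python
import math
from typing import List


def step_uncertainty(token_logprobs: List[float]) -> float:
    """Uncertainty of one reasoning step, taken here (as an assumption)
    to be the average negative log-probability of its tokens."""
    return -sum(token_logprobs) / len(token_logprobs)


def steps_to_scale(
    uncertainties: List[float],
    alpha: float = 0.9,   # assumed momentum coefficient
    gamma: float = 0.9,   # assumed budget knob; smaller -> fewer flagged steps
) -> List[int]:
    """Return indices of steps flagged for extra test-time compute.

    Hypothetical rule: a step is flagged when its uncertainty stands out
    against the running (momentum) uncertainty of the preceding steps.
    """
    flagged: List[int] = []
    momentum = None
    for i, u in enumerate(uncertainties):
        if momentum is None:
            # The first step only initializes the running estimate.
            momentum = u
            continue
        # Compare the current step against history, scaled by gamma.
        if math.exp(u) > math.exp(momentum) / gamma:
            flagged.append(i)
        # Exponential moving average: recent steps weigh more than old ones.
        momentum = alpha * momentum + (1 - alpha) * u
    return flagged


if __name__ == "__main__":
    # Four confident steps and one uncertain step at index 3.
    print(steps_to_scale([0.5, 0.5, 0.5, 2.0, 0.5]))
```

Under these assumptions only the step whose uncertainty spikes above the running average receives extra budget, while the other steps are generated once and kept. That is the kind of selective allocation the abstract describes, with a single hyperparameter controlling how readily a step is flagged.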

Hang Yan, Fangzhi Xu, Rongman Xu, Yifei Li, Jian Zhang, Haoran Luo, Xiaobao Wu, Luu Anh Tuan, Haiteng Zhao, Qika Lin, Jun Liu • 2025

Related benchmarks

Task                          Dataset           Result                 Rank
Mathematical Reasoning        MATH 500          Accuracy (Acc) 84.4    543
Mathematical Reasoning        AIME 24           Accuracy 73.33         318
Mathematical Reasoning        AIME 2024 (test)  Accuracy 53.33         209
Mathematical Reasoning        AIME 2025 (test)  --                     148
Mathematical Reasoning        AIME24            Pass@1 Accuracy 36.67  117
Mathematical Reasoning        MATH 500          Accuracy 94            79
Scientific Reasoning          GPQA Diamond      Latency 7.29           54
Mathematical Problem Solving  MATH              Average Time 4.49      39
Mathematical Problem Solving  AIME 25           Average Time 19.62     39
Mathematical Problem Solving  AIME 24           Average Time 23.24     39
Showing 10 of 17 rows
