MUR: Momentum Uncertainty guided Reasoning for Large Language Models

About

Large Language Models have achieved impressive performance on reasoning-intensive tasks, yet optimizing their reasoning efficiency remains an open challenge. While Test-Time Scaling (TTS) improves reasoning quality, it often leads to overthinking, wasting tokens on redundant computations. This work investigates how to efficiently and adaptively guide test-time scaling in current models without additional training. Inspired by the concept of momentum in physics, we propose Momentum Uncertainty-guided Reasoning (MUR), which dynamically allocates thinking budgets to critical reasoning steps by tracking and aggregating stepwise uncertainty over time. To support flexible inference-time control, we introduce gamma-control, a simple mechanism that tunes the reasoning budget via a single hyperparameter. We provide in-depth theoretical proofs supporting the superiority of MUR in terms of stability and bias. MUR is comprehensively evaluated against various TTS methods across four challenging benchmarks (MATH-500, AIME24, AIME25, and GPQA-diamond) using different sizes of recent Qwen3 models (1.7B, 4B, and 8B). Results demonstrate that MUR reduces computation by over 45% on average while improving accuracy by 0.33% to 3.46%.

Hang Yan, Fangzhi Xu, Rongman Xu, Yifei Li, Jian Zhang, Haoran Luo, Xiaobao Wu, Luu Anh Tuan, Haiteng Zhao, Qika Lin, Jun Liu • 2025
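
The abstract describes tracking stepwise uncertainty over time and using a single hyperparameter to control how much reasoning budget is spent. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes that step uncertainty is the average negative log-probability of a step's tokens, that "momentum" is an exponential moving average of those values, and that a step is flagged for extra computation when its uncertainty exceeds the running momentum divided by gamma. The names `step_uncertainty`, `MomentumTracker`, `flag_steps`, and the parameters `alpha` and `gamma` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


def step_uncertainty(token_logprobs: Sequence[float]) -> float:
    """Average negative log-probability of one step's tokens (assumed definition)."""
    if not token_logprobs:
        raise ValueError("a reasoning step needs at least one token")
    return -sum(token_logprobs) / len(token_logprobs)


@dataclass
class MomentumTracker:
    """Exponential moving average of step uncertainty (illustrative assumption)."""
    alpha: float = 0.9   # hypothetical decay rate: weight kept on past steps
    gamma: float = 0.9   # hypothetical budget knob in (0, 1]; smaller flags fewer steps
    momentum: Optional[float] = None

    def update(self, m_t: float) -> bool:
        """Return True if this step looks uncertain enough to deserve extra compute."""
        if self.momentum is None:
            # First step: nothing to compare against, so just initialise.
            self.momentum = m_t
            return False
        # Compare against the momentum accumulated BEFORE this step.
        needs_scaling = m_t > self.momentum / self.gamma
        # Fold the new observation into the running average.
        self.momentum = self.alpha * self.momentum + (1.0 - self.alpha) * m_t
        return needs_scaling


def flag_steps(uncertainties: Sequence[float], alpha: float, gamma: float) -> List[bool]:
    """Flag the steps that would receive extra test-time computation."""
    tracker = MomentumTracker(alpha=alpha, gamma=gamma)
    return [tracker.update(m) for m in uncertainties]


if __name__ == "__main__":
    # Toy uncertainties: step 4 is a sudden spike relative to its history.
    print(flag_steps([1.0, 1.0, 1.0, 2.0, 1.0], alpha=0.5, gamma=0.9))
```

Under these assumptions, only steps whose uncertainty jumps above the recent trend trigger extra computation, which is one way a method could avoid spending tokens on every step. Lowering `gamma` raises the threshold and so spends less budget.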

Related benchmarks

Task	Dataset	Metric	Result	
Mathematical Reasoning	MATH 500	Accuracy (Acc)	84.4	600
Mathematical Reasoning	AIME 24	Accuracy	73.33	358
Mathematical Reasoning	AIME 2024 (test)	Accuracy	53.33	294
Mathematical Reasoning	AIME 2025 (test)	--	--	191
Mathematical Reasoning	AIME24	Pass@1 Accuracy	36.67	117
Mathematical Reasoning	MATH 500	Accuracy	94	79
Scientific Reasoning	GPQA Diamond	Latency	7.29	54
Mathematical Problem Solving	MATH	Average Time	4.49	39
Mathematical Problem Solving	AIME 25	Average Time	19.62	39
Mathematical Problem Solving	AIME 24	Average Time	23.24	39

Showing 10 of 17 rows
