Identifying and Transferring Reasoning-Critical Neurons: Improving LLM Inference Reliability via Activation Steering

About

Despite the strong reasoning capabilities of recent large language models (LLMs), achieving reliable performance on challenging tasks often requires post-training or computationally expensive sampling strategies, limiting their practical efficiency. In this work, we first show that a small subset of neurons in LLMs exhibits strong predictive correlations with reasoning correctness. Based on this observation, we propose AdaRAS (Adaptive Reasoning Activation Steering), a lightweight test-time framework that improves reasoning reliability by selectively intervening on neuron activations. AdaRAS identifies Reasoning-Critical Neurons (RCNs) via a polarity-aware mean-difference criterion and adaptively steers their activations during inference, enhancing incorrect reasoning traces while avoiding degradation on already-correct cases. Experiments on 10 mathematics and coding benchmarks demonstrate consistent improvements, including over 13% gains on AIME-24 and AIME-25. Moreover, AdaRAS exhibits strong transferability across datasets and scalability to stronger models, outperforming post-training methods without additional training or sampling cost.

Fangan Dong, Zuming Yan, Xuri Ge, Zhiwei Xu, Mengqi Zhang, Xuanang Chen, Ben He, Xin Xin, Zhumin Chen, Ying Zhou • 2026
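
The abstract names two steps: selecting Reasoning-Critical Neurons with a polarity-aware mean-difference criterion, and adaptively steering their activations at inference time. It does not give the exact formulas, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. The function names `select_rcns` and `steer`, the top-k selection rule, the step size `alpha`, and the `risk` gate (a hypothetical score in [0, 1] for how likely the current trace is to be wrong) are all assumptions introduced for illustration.

```python
from statistics import fmean


def select_rcns(correct_acts, incorrect_acts, k):
    """Pick k neurons by mean activation difference (illustrative assumption).

    correct_acts / incorrect_acts: lists of activation vectors recorded on
    traces that ended in a correct / incorrect answer.
    Returns {neuron_index: signed_difference}. The sign is the polarity:
    positive means the neuron is more active on correct traces.
    """
    n_neurons = len(correct_acts[0])
    diffs = []
    for j in range(n_neurons):
        mean_correct = fmean([row[j] for row in correct_acts])
        mean_incorrect = fmean([row[j] for row in incorrect_acts])
        diffs.append(mean_correct - mean_incorrect)
    # Rank by magnitude, but keep the signed value so steering knows the direction.
    ranked = sorted(range(n_neurons), key=lambda j: abs(diffs[j]), reverse=True)
    return {j: diffs[j] for j in ranked[:k]}


def steer(activation, rcns, alpha=1.0, risk=1.0):
    """Shift the selected neurons along their polarity.

    risk is a hypothetical gate: 0.0 leaves the activation untouched
    (trace judged already correct), 1.0 applies the full shift.
    """
    steered = list(activation)
    for j, signed_diff in rcns.items():
        steered[j] += alpha * risk * signed_diff
    return steered


if __name__ == "__main__":
    correct = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
    incorrect = [[0.0, 0.0, 4.0], [0.0, 0.2, 4.0]]
    rcns = select_rcns(correct, incorrect, k=2)
    print(rcns)
    print(steer([1.0, 1.0, 1.0], rcns, alpha=0.5, risk=1.0))
    print(steer([1.0, 1.0, 1.0], rcns, alpha=0.5, risk=0.0))
```

Scaling the shift by `risk` is one simple way to express the "adaptive" idea from the abstract: traces judged likely correct are left nearly unchanged, while likely-incorrect traces receive the full intervention. How AdaRAS actually makes that decision is not stated on this page.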

Related benchmarks

Task	Dataset	Result
Math	GSM8K	Accuracy 0.8908	216
Coding	MBPP	Accuracy 72.22	145
Math	MATH 500	Accuracy 86.4	120
Code	HumanEval	HumanEval Accuracy 79.19	118
Mathematical Problem Solving	AIME 25	Accuracy 54.55	71
Math	AIME24	Accuracy 60.87	57
Code	HumanEval+	Accuracy 73.15	43
Code	MBPP+	Accuracy 60.58	6
Math	AIME-Extend	Accuracy 52.67	6
Math	AMC-12	Accuracy 70.33	6
