
Identifying and Transferring Reasoning-Critical Neurons: Improving LLM Inference Reliability via Activation Steering

About

Despite the strong reasoning capabilities of recent large language models (LLMs), achieving reliable performance on challenging tasks often requires post-training or computationally expensive sampling strategies, limiting their practical efficiency. In this work, we first show that a small subset of neurons in LLMs exhibits strong predictive correlations with reasoning correctness. Based on this observation, we propose AdaRAS (Adaptive Reasoning Activation Steering), a lightweight test-time framework that improves reasoning reliability by selectively intervening on neuron activations. AdaRAS identifies Reasoning-Critical Neurons (RCNs) via a polarity-aware mean-difference criterion and adaptively steers their activations during inference, enhancing incorrect reasoning traces while avoiding degradation on already-correct cases. Experiments on 10 mathematics and coding benchmarks demonstrate consistent improvements, including over 13% gains on AIME-24 and AIME-25. Moreover, AdaRAS exhibits strong transferability across datasets and scalability to stronger models, outperforming post-training methods without additional training or sampling cost.
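The abstract names two steps: selecting neurons with a polarity-aware mean-difference criterion, and steering them adaptively at inference time. The sketch below illustrates one plausible reading of those two steps on plain NumPy arrays. It is not the authors' implementation: the function names `select_rcns` and `steer`, the top-`k` selection rule, the additive update with strength `alpha`, and the `threshold` gate that skips traces already resembling correct ones are all assumptions made for illustration.

```python
import numpy as np


def select_rcns(acts_correct, acts_incorrect, k):
    """Pick k neurons by mean activation difference (hypothetical criterion).

    acts_correct / acts_incorrect: arrays of shape (num_traces, num_neurons)
    holding activations from correct and incorrect reasoning traces.
    Returns the neuron indices and their polarity (+1 if the neuron is more
    active on correct traces, -1 if more active on incorrect ones).
    """
    diff = acts_correct.mean(axis=0) - acts_incorrect.mean(axis=0)
    idx = np.argsort(-np.abs(diff))[:k]  # largest absolute difference first
    polarity = np.sign(diff[idx])
    return idx, polarity


def steer(hidden, idx, polarity, alpha, threshold):
    """Shift the selected neurons along their polarity, but only when needed.

    The gate is an assumed stand-in for "adaptive" steering: if the selected
    neurons already look like a correct trace (score >= threshold), the
    activation vector is returned unchanged.
    """
    score = float(np.mean(polarity * hidden[idx]))
    steered = hidden.copy()
    if score >= threshold:
        return steered, False
    steered[idx] += alpha * polarity
    return steered, True


# Toy data: 3 neurons, 2 correct traces and 2 incorrect traces.
acts_correct = np.array([[1.0, 0.0, 2.0], [1.0, 0.0, 4.0]])
acts_incorrect = np.array([[1.0, 2.0, 0.0], [1.0, 2.0, 0.0]])
idx, polarity = select_rcns(acts_correct, acts_incorrect, k=2)

# A trace that resembles the incorrect pattern gets steered.
steered, changed = steer(np.array([0.5, 1.0, 0.0]), idx, polarity,
                         alpha=0.5, threshold=0.0)
```

In a real model the same idea would be applied to a layer's hidden states during generation rather than to a standalone vector; how the layer, the neuron count, and the steering strength are chosen is not stated in the abstract.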

Fangan Dong, Zuming Yan, Xuri Ge, Zhiwei Xu, Mengqi Zhang, Xuanang Chen, Ben He, Xin Xin, Zhumin Chen, Ying Zhou • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| Math | GSM8K | Accuracy 0.8908 | 87 |
| Mathematical Problem Solving | AIME 25 | Accuracy 54.55 | 54 |
| Code | HumanEval | HumanEval Accuracy 79.19 | 50 |
| Coding | MBPP | Accuracy 72.22 | 31 |
| Math | MATH 500 | Accuracy 86.4 | 25 |
| Code | HumanEval+ | Accuracy 73.15 | 22 |
| Math | AIME24 | Accuracy 60.87 | 20 |
| Code | MBPP+ | Accuracy 60.58 | 6 |
| Math | AIME-Extend | Accuracy 52.67 | 6 |
| Math | AMC-12 | Accuracy 70.33 | 6 |