
Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency

About

Activation engineering enables precise control over Large Language Models (LLMs) without the computational cost of fine-tuning. However, existing methods that derive steering vectors from static activation differences are susceptible to high-dimensional noise and layer-wise semantic drift, often capturing spurious correlations rather than the target intent. To address this, we propose Global Evolutionary Refined Steering (GER-steer), a training-free framework that is grounded in the geometric stability of the network's representation evolution. GER-steer exploits this global signal to rectify raw steering vectors, effectively decoupling robust semantic intent from orthogonal artifacts. Extensive evaluations confirm that GER-steer consistently outperforms baselines, delivering superior efficacy and generalization without layer-specific tuning, establishing a universal solution for reliable model alignment.
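As a rough illustration of the setting the abstract describes, the sketch below builds per-layer steering vectors as a difference of mean activations (the "static activation differences" baseline) and then keeps only the component that is consistent across layers. The cross-layer step is an assumption made for illustration: it uses the top singular vector of the stacked per-layer vectors as a stand-in for a global signal, and it is not the GER-steer algorithm, whose details are not given here. All function names, shapes, and the synthetic activations are hypothetical.

```python
import numpy as np

def diff_of_means(pos_acts, neg_acts):
    """Raw steering vector for one layer: mean(positive) - mean(negative)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def shared_direction(raw_vectors):
    """Unit direction most consistent across layers (illustrative assumption).

    Uses the top right singular vector of the row-normalized per-layer vectors.
    """
    unit = raw_vectors / np.linalg.norm(raw_vectors, axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(unit, full_matrices=False)
    g = vt[0]
    # SVD sign is arbitrary; orient g to agree with the layers overall.
    if (unit @ g).sum() < 0:
        g = -g
    return g

def refine(raw_vectors, g):
    """Keep only each layer's component along g, dropping orthogonal parts."""
    coeffs = raw_vectors @ g
    return np.outer(coeffs, g)

# Synthetic stand-in for hidden activations: every layer shares one true
# concept direction, plus independent Gaussian noise per layer.
rng = np.random.default_rng(0)
n_layers, n_samples, dim = 6, 64, 32
true_dir = np.zeros(dim)
true_dir[0] = 1.0

raw = []
for _ in range(n_layers):
    neg = rng.normal(size=(n_samples, dim))
    pos = rng.normal(size=(n_samples, dim)) + 2.0 * true_dir
    raw.append(diff_of_means(pos, neg))
raw = np.stack(raw)            # shape: (n_layers, dim)

g = shared_direction(raw)      # shape: (dim,)
refined = refine(raw, g)       # shape: (n_layers, dim)
```

In a real model, `pos` and `neg` would be hidden states collected at each layer for contrasting prompts, and a refined vector would be added to that layer's activations at inference time.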

Xinyan Jiang, Wenjing Yu, Di Wang, Lijie Hu • 2026

Related benchmarks

Task                    | Dataset      | Result                | Rank
Mathematical Reasoning  | GSM8K        | Accuracy 89.4         | 1362
Factuality Evaluation   | TruthfulQA   | MC2 94.3              | 73
Safety Refusal          | AdvBench     | Refusal Rate 77.5     | 46
Sentiment Analysis      | SST-2        | Positive Rate 52.5    | 24
AI Text Detection       | HC3 AI-Prob  | Finance Accuracy 51.5 | 24