Steering effectiveness correlation analysis on Gemma-2-2B concept families (across 26 layers)
[Chart: Correlation (rho) Alin vs delta P, Difference-of-means steering; highlighted value 0.871]
Apr 16, 2026
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Correlation (rho) Alin vs delta P | Correlation (rho) lambda vs delta P | Partial Correlation (r) Alin vs delta P | P-Value |
|---|---|---|---|---|---|---|
| Difference-of-means steering | Family=Word transform,... | 2026.04 | 0.871 | -0.744 | 0.809 | 10 |
| Difference-of-means steering | Family=Analogy, Model=... | 2026.04 | 0.866 | -0.923 | 0.647 | 0.001 |
| Difference-of-means steering | Family=Sequence, Model... | 2026.04 | 0.814 | -0.83 | 0.869 | 10 |
| Difference-of-means steering | Family=Pooled (n = 130... | 2026.04 | 0.777 | -0.706 | 0.507 | 10 |
| Difference-of-means steering | Family=Geography, Mode... | 2026.04 | 0.758 | -0.729 | 0.372 | 0.067 |
| Difference-of-means steering | Family=Arithmetic, Mod... | 2026.04 | 0.719 | -0.99 | 0.688 | 10 |
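The page does not state how the reported statistics are computed. As a rough illustration only, the sketch below assumes that "rho" is a Spearman rank correlation taken across the 26 layers, and that the partial correlation is a Pearson correlation between Alin and delta P after regressing out lambda. The data is synthetic, and the names `a_lin`, `lam`, and `delta_p` are hypothetical stand-ins for the per-layer quantities in the table, not values from the benchmark.

```python
import numpy as np

def rank(x):
    """Ranks 0..n-1. Assumes no ties (fine for continuous data)."""
    x = np.asarray(x, dtype=float)
    return np.argsort(np.argsort(x)).astype(float)

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))

def residuals(y, z):
    """Residuals of y after least-squares regression on z (with intercept)."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    return pearson(residuals(x, z), residuals(y, z))

# Synthetic per-layer values (assumption: one point per layer, 26 layers).
rng = np.random.default_rng(0)
n_layers = 26
lam = rng.normal(size=n_layers)
a_lin = -0.5 * lam + rng.normal(size=n_layers)
delta_p = 0.8 * a_lin - 0.3 * lam + 0.5 * rng.normal(size=n_layers)

print("rho(Alin, delta P)      =", round(spearman(a_lin, delta_p), 3))
print("rho(lambda, delta P)    =", round(spearman(lam, delta_p), 3))
print("partial r(Alin, dP | l) =", round(partial_corr(a_lin, delta_p, lam), 3))
```

Regressing out the control variable before correlating is one common way to obtain a partial correlation; a rank-based variant or a different control set would give different numbers.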