Mathematical Reasoning on GSM8K (test) (Mean |Δs|, # Steerable)
[Chart: Mean Absolute Difference (|Δs|) and # Steerable by date. Visible entries: Activation Steering (Qwen3-0.6B), 0.06; May 9, 2026.]
Updated 22d ago
Evaluation Results
Method                            Model       Date     Mean Absolute Difference (|Δs|)  # Steerable
Activation Steering (Qwen3-0.6B)  Qwen3-0.6B  2026.05  0.06                             1
Activation Steering (Qwen3-14B)   Qwen3-14B   2026.05  0.17                             2
Activation Steering (Qwen3-30B)   Qwen3-30B   2026.05  0.18                             2
Activation Steering (Llama-4)     Llama-4     2026.05  0.31                             2
Activation Steering (Qwen3-235B)  Qwen3-235B  2026.05  0.47                             4