Steering on Corrigibility
[Chart: LLM Judge Score and Rank by method. Top entry: DISCO-Q, LLM Judge Score 3.22, May 7, 2026.]
Updated 26d ago
Evaluation Results
| Method | Details | Date | LLM Judge Score | Rank |
|---|---|---|---|---|
| DISCO-Q | Backbone=LLaMA-3.1-8B-... | 2026.05 | 3.22 | 4.75 |
| SKOP | Backbone=LLaMA-3.1-8B-... | 2026.05 | 3.19 | 2.69 |
| Comm Steer | Backbone=LLaMA-3.1-8B-... | 2026.05 | 3.01 | 4.63 |
| CAA | Backbone=LLaMA-3.1-8B-... | 2026.05 | 2.79 | 5.44 |
| LoRA | Backbone=LLaMA-3.1-8B-... | 2026.05 | 2.68 | - |
| ITI | Backbone=LLaMA-3.1-8B-... | 2026.05 | 2.6 | 4.75 |
| SADI | Backbone=LLaMA-3.1-8B-... | 2026.05 | 2.49 | 3.38 |
| CAST | Backbone=LLaMA-3.1-8B-... | 2026.05 | 2.28 | 5.63 |
| Angular Steer | Backbone=LLaMA-3.1-8B-... | 2026.05 | 2.18 | 4.75 |
| Baseline | Backbone=LLaMA-3.1-8B-... | 2026.05 | 1.94 | - |