Constitutional AI Alignment on SafeRLHF (test)
[Chart: Likert Score (5-Point). Top score shown: 4.652 (Reflect), Jan 26, 2026. Metrics: Likert Score (5-Point), Principle Violations, Delta Score.]
Evaluation Results
Method   Backbone Model   Date      Likert Score (5-Point)   Principle Violations   Delta Score
Reflect  GPT-4.1...       2026.01   4.652                    3.45                   0.098
Reflect  Mistral-7B       2026.01   4.628                    2.917                  0.035
CCBase   Mistral-7B       2026.01   4.593                    4.017                  -
CCBase   GPT-4.1...       2026.01   4.554                    7.717                  -
Reflect  Claude-...       2026.01   4.155                    13.26                  1.26
CCBase   Claude-...       2026.01   2.895                    47.3                   -