
Constitutional AI Alignment on SafeRLHF (test)

4.652 Likert Score (5-Point)

Reflect

[Chart: Likert Score (5-Point) over time; x-axis label: Jan 26, 2026]
Updated 4d ago

Evaluation Results

Method    Links
2026.01   4.652   3.45    0.098
2026.01   4.628   2.917   0.035
2026.01   4.593   4.017   -
2026.01   4.554   7.717   -
2026.01   4.155   13.26   1.26
2026.01   2.895   47.3    -