Alignment on Human Values (Help, Honest, Truth)
[Chart: Helpfulness Score over time. Highlighted point: MAH-DPO, 0.7165, Oct 1, 2025. Other views: Honesty Score, Truthfulness Score.]
Evaluation Results
Method           Configuration              Date     Helpfulness Score  Honesty Score  Truthfulness Score
MAH-DPO          PRM Guidance=Help PRM      2025.10  0.7165             0.4554         0.389
MAH-DPO          PRM Guidance=Honest PRM    2025.10  0.6968             0.5196         0.4107
MAH-DPO          PRM Guidance=Truth PRM     2025.10  0.6834             0.4872         0.363
MAH-DPO          -                          2025.10  0.6389             0.3687         0.2478
MODPO            -                          2025.10  0.6175             0.3477         0.2325
Single-Head DPO  -                          2025.10  0.6043             0.3055         0.2014