Model Alignment on HH-RLHF D3 (test)
[Chart: Harmlessness BLEU Score. Best result: DEFT-PRO, 32.77 (Apr 2, 2026). Selectable metrics: BLEU Score, BARTScore, and Reward Score, each for Harmlessness, Helpfulness, and Total.]
Updated 15d ago
Evaluation Results
| Method | Trained on | Date | Harmlessness BLEU Score | Harmlessness BARTScore | Harmlessness Reward Score | Helpfulness BLEU Score | Helpfulness BARTScore | Helpfulness Reward Score | Total BLEU Score | Total BARTScore | Total Reward Score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DEFT-PRO | D3 | 2026.04 | 32.77 | 3.79 | 73.79 | 34.66 | 3.65 | 71.24 | 34.15 | 3.69 | 71.93 |
| DEFT-DPO | D3 | 2026.04 | 32.03 | 3.95 | 71.45 | 36.77 | 4.16 | 73.12 | 35.49 | 4.1 | 72.67 |
| SFT | D3 | 2026.04 | 31.76 | 3.86 | 72.48 | 34.91 | 3.84 | 68.54 | 34.06 | 3.85 | 69.6 |
| PRO | D3 | 2026.04 | 29.4 | 3.56 | 72.95 | 33.5 | 3.64 | 68.49 | 33.5 | 3.62 | 69.69 |
| DPO | D3 | 2026.04 | 29.03 | 3.88 | 74.23 | 34.79 | 4.04 | 69.27 | 33.23 | 4 | 70.61 |