Model Alignment on HH-RLHF D2 (test)
[Chart: Harmlessness BLEU over time on HH-RLHF D2 (test); best result 20.13 by DEFT-DPO as of Apr 2, 2026. Metrics tracked: Harmlessness/Helpfulness/Total BLEU, BARTScore, and Reward.]
Evaluation Results
| Method | Trained on | Date | Harmlessness BLEU | Harmlessness BARTScore | Harmlessness Reward | Helpfulness BLEU | Helpfulness BARTScore | Helpfulness Reward | Total BLEU | Total BARTScore | Total Reward |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DEFT-DPO | D2 | 2026.04 | 20.13 | 2.87 | 65.35 | 30.08 | 3.15 | 60.21 | 27.39 | 3.07 | 61.60 |
| DPO | D2 | 2026.04 | 17.04 | 2.25 | 59.51 | 28.40 | 2.69 | 57.05 | 25.33 | 2.56 | 57.72 |
| DEFT-PRO | D2 | 2026.04 | 8.54 | 1.77 | 62.21 | 22.58 | 2.70 | 58.43 | 18.78 | 2.45 | 59.45 |
| SFT | D2 | 2026.04 | 7.79 | 1.77 | 60.89 | 19.46 | 1.99 | 50.65 | 16.30 | 1.93 | 53.42 |
| PRO | D2 | 2026.04 | 7.72 | 1.75 | 61.30 | 20.27 | 2.06 | 53.07 | 16.87 | 1.98 | 55.29 |
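Several of the methods above (DPO, DEFT-DPO) are preference-optimization variants. As context for what those rows are optimizing, here is a minimal sketch of the standard per-pair DPO loss, -log σ(β·margin), where the margin is the policy's implicit reward gap over a frozen reference model. All names and the β value are illustrative, not taken from the leaderboard:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    # -log sigmoid(margin): small when the policy cleanly separates the pair.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; raising the chosen response's log-probability relative to the reference drives the loss down.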
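The BLEU columns in the table are n-gram overlap scores against reference responses. A minimal self-contained sketch of sentence-level BLEU (with add-one smoothing and a brevity penalty; the leaderboard's exact tokenization and smoothing settings are not specified, so this is only illustrative):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams appearing in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, hypothesis, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        ref_ng = ngrams(reference, n)
        hyp_ng = ngrams(hypothesis, n)
        overlap = sum((hyp_ng & ref_ng).values())   # clipped n-gram matches
        total = max(sum(hyp_ng.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # add-one smoothing
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = math.exp(min(0.0, 1 - len(reference) / max(len(hypothesis), 1)))
    return bp * math.exp(log_avg)
```

An identical hypothesis scores 1.0; truncated or divergent outputs score lower, which is the sense in which the Harmlessness/Helpfulness BLEU columns measure closeness to the reference responses on each data split.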