Alignment-diversity coverage on HH-RLHF
[Chart: Coverage by method. Top result: EvoPref, 81.7 Coverage. Date shown: May 21, 2026. Selectable metrics: Coverage, Reward, Collapse Ratio.]
Evaluation Results
Method        Variant           Date     Coverage  Reward  Collapse Ratio
EvoPref       -                 2026.05  81.7      0.83    0.18
Group DPO     -                 2026.05  68.4      0.84    0.24
MO-RLHF       -                 2026.05  63.8      0.85    0.29
DPO-Ensemble  ensemble size=28  2026.05  54.3      0.86    0.41
DPO           type=single       2026.05  34.7      0.88    0.65