Response Diversity on Anthropic HH-RLHF
[Chart: Preference Coverage. Highlighted point: EvoPref, 82.5, May 10, 2026. Selectable metrics: Preference Coverage, Self-BLEU, Collapse Rate, Hypervolume.]
Updated 21d ago
Evaluation Results
Method    Settings (truncated)       Date     Preference Coverage  Self-BLEU  Collapse Rate  Hypervolume
EvoPref   type=Multi-Objective E...  2026.05  82.5                 0.297      11             0.84
SMS-EMOA  type=Multi-Objective E...  2026.05  78.2                 0.327      14.1           0.82
MOEA/D    type=Multi-Objective E...  2026.05  76.9                 0.342      15.5           0.8
CMA-ES    type=Single-Objective...   2026.05  72.5                 0.361      18.1           0.71
ORPO      optimizer=AdamW, learn...  2026.05  70                   0.389      20.6           -
DPO       beta=0.1, optimizer=Ad...  2026.05  67.1                 0.414      23.3           -
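
The Hypervolume column above is reported without a definition on this page. As a rough illustration only, the sketch below computes the standard two-objective hypervolume indicator: the area dominated by a set of points, measured from a reference point. It assumes both objectives are maximized and that the reference point is (0, 0). The function name `hypervolume_2d` and the toy point sets are hypothetical, and nothing here is confirmed to match how the leaderboard values were produced.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by `points` relative to `ref` (both objectives maximized).

    Illustrative sketch only; the benchmark's actual objectives, normalization,
    and reference point are not stated in the source.
    """
    # Keep only points that strictly improve on the reference in both objectives.
    pts = [(x, y) for x, y in points if x > ref[0] and y > ref[1]]
    # Sweep from the largest first objective to the smallest.
    pts.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        # A point adds a new rectangle only if it raises the best second objective so far.
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Toy data (hypothetical): a spread-out front versus a collapsed one.
diverse_front = [(1.0, 0.2), (0.6, 0.6), (0.2, 1.0)]
collapsed_front = [(0.5, 0.5), (0.5, 0.5), (0.5, 0.5)]

print(hypervolume_2d(diverse_front))
print(hypervolume_2d(collapsed_front))
```

Under these assumptions, a set of points spread along the trade-off curve covers more area than a set collapsed onto one point, which is why hypervolume is often read as a joint measure of quality and diversity.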