Preference Modeling on MultiPref held-out (test)
[Chart: Preference Accuracy by date. Highlighted result: Mean-Var, 66.4 Preference Accuracy, Oct 18, 2024. Metrics available: Preference Accuracy, Diverging ID AUROC.]
Updated 1mo ago
Evaluation Results
Method         Configuration              Date     Preference Accuracy  Diverging ID AUROC
Mean-Var       Reward Model Type=Dist...  2024.10  66.4                 0.615
Bradley-Terry  Reward Model Type=Sing...  2024.10  66.3                 0.458
Skywork        Reward Model Type=Sing...  2024.10  65.1                 0.494
Bradley-Terry  Reward Model Type=Sing...  2024.10  64.8                 0.438
Nemotron       Reward Model Type=Sing...  2024.10  63.8                 0.4
Mean-Var       Reward Model Type=Dist...  2024.10  53.3                 0.549