Preference Modeling on MultiPref held-out (test)

66.4 Preference Accuracy

Mean-Var

Updated 4mo ago

Evaluation Results

Method	Links	Preference Accuracy
Mean-Var 2024.10		66.4	0.615
Bradley-Terry 2024.10		66.3	0.458
Skywork 2024.10		65.1	0.494
Bradley-Terry 2024.10		64.8	0.438
Nemotron 2024.10		63.8	0.4
Mean-Var 2024.10		53.3	0.549