Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Preference Modeling on HelpSteer2 held-out (test)

68.4Preference Accuracy

Mean-Var

56.9659.9362.965.87Oct 18, 2024
Updated 1mo ago

Evaluation Results

MethodLinks
2024.10
68.40.582
2024.10
68.30.482
2024.10
67.80.489
2024.10
67.50.481
2024.10
66.90.488
2024.10
65.90.648
2024.10
57.40.573