Reward Modeling on MT-Bench OOD (test)

Score: 73

GRM w/ sft

[Chart: score over time; y-axis ticks 68.008, 69.304, 70.6, 71.896; x-axis date Jun 14, 2024]

Evaluation Results

Method | Links
2024.06 | 73 | -
2024.06 | 72.1 | -
2024.06 | 71.9 | -
2024.06 | 71.3 | -
2024.06 | 71.1 | -
2024.06 | 71 | -
2024.06 | 69.1 | -
2024.06 | 68.2 | -
2024.06 | - | 69.5
2024.06 | - | 71.2
2024.06 | - | 72.6
2024.06 | - | 71.2
2024.06 | - | 73.7
2024.06 | - | 73.4
2024.06 | - | 73
2024.06 | - | 73.4