Reward Modeling on Unified-Feedback (ID)

73.9 Accuracy

GRM w/ dpo-noref

[Chart: accuracy over time; y-axis ticks 63.396, 66.123, 68.85, 71.577; Jun 14, 2024]
Updated 4d ago

Evaluation Results

Date      Accuracy
2024.06   73.9
2024.06   73.8
2024.06   73.2
2024.06   72.8
2024.06   72.1
2024.06   72
2024.06   71.5
2024.06   63.8