
Reward Modeling on Unified-Feedback (ID)

Accuracy: 73.9

GRM w/ dpo-noref

[Chart: Accuracy (y-axis, 63.396 to 71.577) by date (x-axis, Jun 14, 2024)]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2024.06   73.9
2024.06   73.8
2024.06   73.2
2024.06   72.8
2024.06   72.1
2024.06   72
2024.06   71.5
2024.06   63.8