Reward Modeling on RM-Bench Hard

Accuracy: 0.697

WILDREWARD-8B

[Chart: accuracy, y-axis ticks 0.40996, 0.48448, 0.559, 0.63352; Feb 9, 2026]
Updated 4d ago

Evaluation Results

Date      Accuracy
2026.02   0.697
2026.02   0.686
2026.02   0.628
          0.569
2026.02   0.558
2026.02   0.54
2026.02   0.514
2026.02   0.493
2026.02   0.478
2026.02   0.421