Reward Modeling on RM-Bench Easy
Top result: Llama-3.1-Nemotron-70B, 92.2 accuracy (Feb 9, 2026). Updated 4d ago.
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| Llama-3.1-Nemotron-70B | Size=70B | 2026.02 | 92.2 |
| INF-ORM-Llama3.1-70B | Backbone=Llama3.1-70B | 2026.02 | 92.1 |
| Athene-RM-8B | Size=8B | 2026.02 | 89.8 |
| Skywork-Reward-Gemma-2-27B-v0.2 | Backbone=Gemma-2-27B | 2026.02 | 88.9 |
| Llama-3-OffsetBias-RM-8B | Size=8B | 2026.02 | 83.9 |
| WILDREWARD-8B | Size=8B | 2026.02 | 83.5 |
| WILDREWARD-4B | Size=4B | 2026.02 | 82 |
| ArmoRM-Llama3-8B-v0.1 | Backbone=Llama3-8B | 2026.02 | 80.4 |
| Internlm2-20b-reward | Size=20b | 2026.02 | 79.4 |
| Skywork-Reward-Llama-3.1-8B-v0.2 | Backbone=Llama-3.1-8B | 2026.02 | 70.5 |