Reward Modeling on IFBench Simple
[Chart: Accuracy by date. Leading entry: Skywork-Reward-Gemma-2-27B, 87.2 accuracy, Feb 26, 2025]
Updated 4d ago
Evaluation Results
| Method | Date | Accuracy |
| --- | --- | --- |
| Skywork-Reward-Gemma-2-27B | 2025.02 | 87.2 |
| GPT-4o | 2025.02 | 85.1 |
| REWARDAGENT_MINI (Search Engine=true) | 2025.02 | 85.1 |
| o3-mini | 2025.02 | 81.9 |
| INF-ORM-Llama3.1-70B | 2025.02 | 78.7 |
| Skywork-Reward-Llama-3.1-8B-v0.2 | 2025.02 | 78.7 |
| REWARDAGENT_MINI (Search Engine=false) | 2025.02 | 78.7 |
| internlm2-7b-reward | 2025.02 | 74.5 |
| internlm2-20b-reward | 2025.02 | 74.5 |
| REWARDAGENT_LLAMA (Search Engine=true) | 2025.02 | 74.5 |
| ArmoRM-Llama3-8B-v0.1 | 2025.02 | 72.3 |
| DeepSeek-R1 | 2025.02 | 72.3 |
| GPT-4o mini | 2025.02 | 70.2 |
| REWARDAGENT_LLAMA (Search Engine=false) | 2025.02 | 70.2 |
| DeepSeek-R1-Distill-Llama-8B | 2025.02 | 53.2 |
| Llama3-8B Instruct | 2025.02 | 12.8 |