LLM-as-a-Judge on RewardBench
[Chart: Accuracy over time. Best result: 92.9, Qwen3-Next-80B-A3B-Thinking, Jan 7, 2026]
Evaluation Results

Method                        Details                    Date     Accuracy
Qwen3-Next-80B-A3B-Thinking   Parameters=80B, Active...  2026.01  92.9
Qwen3-30B-A3B-Thinking-2507   Parameters=30B, Active...  2026.01  92.01
DeepSeek-R1                   -                          2026.01  91.18
QwQ-32B                       Parameters=32B             2026.01  91.05
Qwen3-30B-A3B-Instruct-2507   Parameters=30B, Active...  2026.01  89.88
DeepSeek-V3                   -                          2026.01  89.74
Qwen2.5-32B-Instruct          Parameters=32B, Mode=I...  2026.01  89.31
Qwen3-Next-80B-A3B-Instruct   Parameters=80B, Active...  2026.01  88.96