
LLM-as-a-Judge on RewardBench

Accuracy: 92.9

Qwen3-Next-80B-A3B-Thinking

[Chart: accuracy over time; latest point Jan 7, 2026]
Updated 4d ago

Evaluation Results

Method    Links
2026.01    92.9
2026.01    92.01
           91.18
2026.01    91.05
2026.01    89.88
           89.74
2026.01    89.31
2026.01    88.96