LLM-as-a-Judge on JudgeBench
Top result: DeepSeek-V3, 84.19 Accuracy (Jan 7, 2026)
[Chart: Accuracy by date]
Evaluation Results
Method                        Details                    Date     Accuracy
DeepSeek-V3                                              2026.01  84.19
Qwen3-30B-A3B-Thinking-2507   Parameters=30B, Active...  2026.01  83.87
Qwen3-Next-80B-A3B-Thinking   Parameters=80B, Active...  2026.01  82.42
DeepSeek-R1                                              2026.01  80.48
QwQ-32B                       Parameters=32B             2026.01  79.75
Qwen3-Next-80B-A3B-Instruct   Parameters=80B, Active...  2026.01  79.45
Qwen3-30B-A3B-Instruct-2507   Parameters=30B, Active...  2026.01  74
Qwen2.5-32B-Instruct          Parameters=32B, Mode=I...  2026.01  60.4