
LLM-as-a-Judge on JudgeBench

84.19 Accuracy

DeepSeek-V3

[Chart: accuracy over time. Y-axis ticks: 59.4484, 65.8717, 72.295, 78.7183. Latest date shown: Jan 7, 2026]
Updated 4d ago

Evaluation Results

Accuracy   Date
84.19      2026.01
83.87      2026.01
82.42
80.48      2026.01
79.75      2026.01
79.45      2026.01
74         2026.01
60.4