Judge Accuracy on Ultra-Problem (Bench)
[Chart: Accuracy by date. Biased Rubric Search: 73 (Feb 14, 2026)]
Updated 4d ago
Evaluation Results
Method                 Judge Model   Date      Accuracy
Biased Rubric Search   Qwen3-14B     2026.02   73
Seed Rubric            Qwen3-14B     2026.02   72.8