Graduate-Level Reasoning on GPQA (Pass@1)
Top result: 70.18 Pass@1 (Phi-4-mini + Mistral3-3B)
[Chart: Pass@1 over time; data point dated Jan 29, 2026]
Evaluation Results
Method                              Training   Date      Pass@1
Phi-4-mini + Mistral3-3B            CORE       2026.01   70.18
Phi-4-mini + Mistral3-3B + Oracle   SD-E²      2026.01   48.12
Ministral-3-8B-Reasoning            SD-E²      2026.01   33
Qwen2.5-7B-Instruct                 SD-E²      2026.01   30
Phi-3-small-8k-Instruct             SD-E²      2026.01   26