Knowledge Reasoning on GPQA Diamond (Accuracy)
[Figure: Accuracy over time, Aug 12, 2025 to May 26, 2026. Highlighted point: General Teacher, Accuracy 62.5.]
Updated 7d ago
Evaluation Results
Method | Details | Date (YYYY.MM) | Accuracy
--- | --- | --- | ---
General Teacher | | 2026.05 | 62.5
Medical Teacher | | 2026.05 | 62.37
CaMOPD | | 2026.05 | 61.99
Relaxed OPD | | 2026.05 | 61.49
Qwen2.5-32B-Instruct + Bootcamp-SFT-RL | Model=Qwen2.5-32B-Inst... | 2025.08 | 60.7
Vanilla MOPD | | 2026.05 | 60.1
SelecTKD | | 2026.05 | 58.46
DS-R1-Distilled-Qwen-32B + Bootcamp-RL | Model=DS-R1-Distilled-... | 2025.08 | 51.6
Qwen2.5-32B-Instruct | Model=Qwen2.5-32B-Inst... | 2025.08 | 44.7
Qwen2.5-32B-Instruct + Bootcamp-RL | Model=Qwen2.5-32B-Inst... | 2025.08 | 44.7
DS-R1-Distilled-Qwen-32B | Model=DS-R1-Distilled-... | 2025.08 | 41.6
Qwen2.5-32B-Instruct + Bootcamp-SFT | Model=Qwen2.5-32B-Inst... | 2025.08 | 26.2