Knowledge Reasoning on Super GPQA (Accuracy)
[Chart: Accuracy. Qwen2.5-32B-Instruct + Bootcamp-SFT-RL: 51 (Aug 12, 2025)]
Evaluation Results
Method | Details | Date | Accuracy
Qwen2.5-32B-Instruct + Bootcamp-SFT-RL | Model=Qwen2.5-32B-Inst... | 2025.08 | 51
DS-R1-Distilled-Qwen-32B + Bootcamp-RL | Model=DS-R1-Distilled-... | 2025.08 | 48.7
Qwen2.5-32B-Instruct + Bootcamp-SFT | Model=Qwen2.5-32B-Inst... | 2025.08 | 48.5
DS-R1-Distilled-Qwen-32B | Model=DS-R1-Distilled-... | 2025.08 | 45.9
Qwen2.5-32B-Instruct | Model=Qwen2.5-32B-Inst... | 2025.08 | 39.3
Qwen2.5-32B-Instruct + Bootcamp-RL | Model=Qwen2.5-32B-Inst... | 2025.08 | 38