Graduate-level Science QA on GPQA (Accuracy)
[Chart: Accuracy. Top entry: Qwen2.5-3B-Instruct, 30.8 (May 14, 2026). Updated 16d ago.]
Evaluation Results
Method                  Configuration               Date      Accuracy
Qwen2.5-3B-Instruct     Backbone=Qwen2.5-3B, T...   2026.05   30.8
Qwen2.5-3B-RLVR         Backbone=Qwen2.5-3B, T...   2026.05   29.3
Qwen2.5-3B-GRLO         Backbone=Qwen2.5-3B, T...   2026.05   29.3
Qwen2.5-3B-GRLO+RLVR    Backbone=Qwen2.5-3B, T...   2026.05   27.8
Qwen2.5-3B-Base         Backbone=Qwen2.5-3B, T...   2026.05   22.2
Qwen2.5-3B-MathSFT      Backbone=Qwen2.5-3B, T...   2026.05   4.6