Mathematical and General Reasoning on DeepMATH (test)
[Chart: MATH 500 Score by method. Highlighted point: BF16, 83.4 (Jan 20, 2026). Selectable metrics: MATH 500 Score, GSM8k Score, GPQA Score, SuperGPQA Score, Average Score.]
Evaluation Results
| Method | Configuration | Date | MATH 500 Score | GSM8k Score | GPQA Score | SuperGPQA Score | Average Score |
|---|---|---|---|---|---|---|---|
| BF16 | Model=Qwen3-8B-Base, R... | 2026.01 | 83.4 | - | 44.2 | 36.1 | 54.6 |
| Jet-RL | Model=Qwen3-8B-Base, R... | 2026.01 | 80.2 | - | 47.2 | 33.8 | 53.7 |
| Before Tuning | Model=Qwen3-8B-Base, R... | 2026.01 | 69.7 | 63.4 | 46.2 | 31.8 | 52.8 |
| BF16-Train-FP8-Rollout | Model=Qwen3-8B-Base, R... | 2026.01 | 57.8 | - | 42.6 | 32.6 | 44.3 |
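The page does not say how the Average Score is computed. The listed values are consistent with a plain mean over whichever benchmark scores are reported for a method, with "-" entries skipped. The sketch below checks that reading against the four rows; the helper name `average_reported` and the dictionary layout are illustrative assumptions, not part of the leaderboard.

```python
from statistics import mean

# Scores copied from the results above, in the order
# MATH 500, GSM8k, GPQA, SuperGPQA. None stands for a "-" (not reported) entry.
scores = {
    "BF16": [83.4, None, 44.2, 36.1],
    "Jet-RL": [80.2, None, 47.2, 33.8],
    "Before Tuning": [69.7, 63.4, 46.2, 31.8],
    "BF16-Train-FP8-Rollout": [57.8, None, 42.6, 32.6],
}

# Average Score values as listed on the page.
listed_average = {
    "BF16": 54.6,
    "Jet-RL": 53.7,
    "Before Tuning": 52.8,
    "BF16-Train-FP8-Rollout": 44.3,
}


def average_reported(values):
    """Mean of the reported scores only, rounded to one decimal place.

    Assumption: missing ("-") scores are skipped rather than counted as zero.
    """
    reported = [v for v in values if v is not None]
    return round(mean(reported), 1)


for method, values in scores.items():
    computed = average_reported(values)
    matches = abs(computed - listed_average[method]) < 1e-9
    print(f"{method}: computed={computed}, listed={listed_average[method]}, match={matches}")
```

Under this reading, the averages of methods without a GSM8k score are taken over three benchmarks, while "Before Tuning" is averaged over four, so the Average Score column is not strictly like-for-like across rows.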