Math Reasoning on MATH (Pass@8)
[Chart: Pass@8 over time. Highlighted result: RFT, 98.4 Pass@8, Apr 13, 2026.]
Evaluation Results
Method              Base Model      Date     Pass@8
RFT                 Qwen3-4B-In...  2026.04  98.4
GRPO                Qwen3-4B-In...  2026.04  98
Olmo-3-7B-Instruct  Olmo-3-7B-I...  2026.04  98
SRT                 Olmo-3-7B-I...  2026.04  98
Qwen3-4B-Instruct   Qwen3-4B-In...  2026.04  97.6
SRT                 Qwen3-4B-In...  2026.04  97.6
SFT                 Qwen3-4B-In...  2026.04  97
SDFT                Qwen3-4B-In...  2026.04  97
SD-ZERO             Qwen3-4B-In...  2026.04  97
RFT                 Olmo-3-7B-I...  2026.04  97
GRPO                Olmo-3-7B-I...  2026.04  97
SDFT                Olmo-3-7B-I...  2026.04  97
SD-ZERO             Olmo-3-7B-I...  2026.04  97
SFT                 Olmo-3-7B-I...  2026.04  96
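
The page reports Pass@8 scores but does not say how they are computed. As a rough illustration only, the sketch below uses the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them are correct. Whether this leaderboard uses that estimator, or simply draws exactly 8 samples per problem, is an assumption; the function names and the sample counts in the usage note are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Assumption: this is the standard unbiased estimator, not necessarily
    the exact procedure behind the scores listed on this page.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no correct answer.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average per-problem pass@k over a benchmark, as a percentage."""
    if not correct_counts:
        raise ValueError("correct_counts must not be empty")
    scores = [pass_at_k(n, c, k) for c in correct_counts]
    return 100.0 * sum(scores) / len(scores)
```

With n = k = 8 the estimator reduces to "at least one of the 8 samples is correct". For example, `benchmark_pass_at_k([0, 1, 8, 3], n=8, k=8)` evaluates four hypothetical problems, three of which have at least one correct sample, and returns 75.0.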