Math Reasoning on OpenR1
[Figure: Pass@8 chart. Top result: SD-ZERO, Pass@8 = 72 (Apr 13, 2026).]
Updated 4d ago
Evaluation Results
Method              Base Model      Date     Pass@8
SD-ZERO             Qwen3-4B-In...  2026.04  72
RFT                 Qwen3-4B-In...  2026.04  70.2
SRT                 Qwen3-4B-In...  2026.04  70.2
GRPO                Qwen3-4B-In...  2026.04  70
SDFT                Qwen3-4B-In...  2026.04  70
GRPO                Olmo-3-7B-I...  2026.04  70
Qwen3-4B-Instruct   Qwen3-4B-In...  2026.04  69.2
SFT                 Qwen3-4B-In...  2026.04  69.2
SD-ZERO             Olmo-3-7B-I...  2026.04  68
SRT                 Olmo-3-7B-I...  2026.04  66
RFT                 Olmo-3-7B-I...  2026.04  62
Olmo-3-7B-Instruct  Olmo-3-7B-I...  2026.04  60
SDFT                Olmo-3-7B-I...  2026.04  58
SFT                 Olmo-3-7B-I...  2026.04  55