Mathematical Reasoning on MATH (pass@1, pass@16)
[Chart: Pass@1 and Pass@16 over time. Highlighted point: TTRL, Pass@1 85.7, Sep 18, 2025.]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Pass@1 | Pass@16 |
|---|---|---|---|---|
| TTRL | Base Model Size=8B, Tr... | 2025.09 | 85.7 | 91.9 |
| EVOL-RL | Base Model Size=8B, Tr... | 2025.09 | 84.7 | 95.1 |
| EVOL-RL | Base Model Size=8B, Tr... | 2025.09 | 83.6 | 94.1 |
| EVOL-RL | Base Model Size=8B, Tr... | 2025.09 | 83.1 | 94.2 |
| TTRL | Base Model Size=8B, Tr... | 2025.09 | 81.1 | 91.1 |
| EVOL-RL | Base Model Size=4B, Tr... | 2025.09 | 80.0 | 93.3 |
| EVOL-RL | Base Model Size=4B, Tr... | 2025.09 | 79.8 | 93.8 |
| EVOL-RL | Base Model Size=4B, Tr... | 2025.09 | 79.6 | 93.6 |
| TTRL | Base Model Size=4B, Tr... | 2025.09 | 79.3 | 83.2 |
| TTRL | Base Model Size=8B, Tr... | 2025.09 | 76.8 | 86.2 |
| TTRL | Base Model Size=4B, Tr... | 2025.09 | 75.4 | 86.9 |
| TTRL | Base Model Size=4B, Tr... | 2025.09 | 73.8 | 84.5 |
| Qwen3-4B-Base | Base Model Size=4B, Tr... | 2025.09 | 67.4 | 89.6 |
| Qwen3-8B-Base | Base Model Size=8B, Tr... | 2025.09 | 63.6 | 91.5 |