Mathematical Reasoning on AMOBench (avg@8)
[Chart: Avg@8 Score over time; labeled point: SRT, 16, Apr 13, 2026]
Updated 4d ago
Evaluation Results
Method              Base Model        Date     Avg@8 Score
SRT                 Qwen3-4B-In...    2026.04  16
SD-ZERO             Qwen3-4B-In...    2026.04  16
RFT                 Qwen3-4B-In...    2026.04  11.3
GRPO                Qwen3-4B-In...    2026.04  11
Qwen3-4B-Instruct   Qwen3-4B-In...    2026.04  9.8
SDFT                Qwen3-4B-In...    2026.04  9
SFT                 Qwen3-4B-In...    2026.04  7.3
SD-ZERO             Olmo-3-7B-I...    2026.04  5.5
GRPO                Olmo-3-7B-I...    2026.04  4.8
SRT                 Olmo-3-7B-I...    2026.04  3.5
RFT                 Olmo-3-7B-I...    2026.04  2.3
Olmo-3-7B-Instruct  Olmo-3-7B-I...    2026.04  1.3
SFT                 Olmo-3-7B-I...    2026.04  1.3
SDFT                Olmo-3-7B-I...    2026.04  1.3
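
This page does not define avg@8. A common reading, assumed here, is that each problem is attempted 8 times, the fraction of correct attempts is taken per problem, and those fractions are averaged over all problems and reported as a percentage. The sketch below illustrates that reading only; the function name `avg_at_k` and the input layout are hypothetical and are not taken from AMOBench or from any of the listed methods.

```python
from statistics import mean


def avg_at_k(correct, k=8):
    """Mean per-problem accuracy over k attempts, as a percentage.

    `correct` is a list with one entry per problem; each entry is a list
    of k booleans, one per sampled attempt (hypothetical input layout).
    """
    per_problem = []
    for attempts in correct:
        if len(attempts) != k:
            raise ValueError(f"expected {k} attempts, got {len(attempts)}")
        # Fraction of the k attempts that were graded correct.
        per_problem.append(sum(attempts) / k)
    # Average over problems, scaled to a 0-100 score.
    return 100.0 * mean(per_problem)


if __name__ == "__main__":
    # Two toy problems: one solved in 1 of 8 attempts, one never solved.
    toy = [[True] + [False] * 7, [False] * 8]
    print(avg_at_k(toy))
```

Under this assumed definition, a score such as 16 would mean that about 16% of sampled attempts were correct on average across problems.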