Mathematical Reasoning and Coding on DAPO-17k
[Chart: Peak Accuracy@8. Top entry: seq-agg, 49.47, Apr 14, 2026.]
Updated 27d ago
Evaluation Results
| Method | Details | Date | Peak Accuracy@8 | Peak Best Accuracy@8 | Last-Step Accuracy@8 | Last-Step Best Accuracy@8 |
|---|---|---|---|---|---|---|
| seq-agg | Model=Qwen3-1.7B, Aggr... | 2026.04 | 49.47 | 59.9 | 44.81 | 55.22 |
| balanced-agg | Model=Qwen3-1.7B, Aggr... | 2026.04 | 49.28 | 60.86 | 46.95 | 58.61 |
| token-agg | Model=Qwen3-1.7B, Aggr... | 2026.04 | 48.76 | 60.58 | 43.6 | 56.08 |
| balanced-agg | Model=Qwen2.5-Math-7B,... | 2026.04 | 36.34 | 48.96 | 34.24 | 46.27 |
| token-agg | Model=Qwen2.5-Math-7B,... | 2026.04 | 35.95 | 48.69 | 33.64 | 44.34 |
| seq-agg | Model=Qwen2.5-Math-7B,... | 2026.04 | 35.78 | 48.21 | 34.46 | 45.98 |