DAPO-Math

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Math Reasoning | DAPO-Math 100 (test) | Pass Rate: 92 | 6 |
| Mathematical Problem Solving | DAPO-Math (val) | Pass@1 Accuracy: 32.5 | 5 |
| Mathematical Reasoning | DAPO-Math-17k (test) | Final Eval Reward: 0.627 | 3 |
| End-to-end training performance | DAPO-MATH-17k (train) | Step Time (s): 125.6 | 2 |
| Value Modeling | DAPO-Math-17k Qwen2.5-7B-Instruct policy (Held-out) | Intra AUC: 0.693 | 2 |
| Value Modeling | DAPO-Math-17k Qwen3-4B-Instruct-2507 policy (Held-out) | Intra AUC: 0.689 | 2 |
| Value Modeling | DAPO-Math-17k DeepSeek-R1-Distill-Qwen-1.5B policy (Held-out) | Intra AUC: 71 | 2 |