| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Math Reasoning | DAPO-Math 100 (test) | Pass Rate | 92 | 6 |
| Mathematical Problem Solving | DAPO-Math (val) | Pass@1 Accuracy | 32.5 | 5 |
| Mathematical Reasoning | DAPO-Math-17k (test) | Final Eval Reward | 0.627 | 3 |
| End-to-end training performance | DAPO-MATH-17k (train) | Step Time (s) | 125.6 | 2 |
| Value Modeling | DAPO-Math-17k Qwen2.5-7B-Instruct policy (Held-out) | Intra AUC | 0.693 | 2 |
| Value Modeling | DAPO-Math-17k Qwen3-4B-Instruct-2507 policy (Held-out) | Intra AUC | 0.689 | 2 |
| Value Modeling | DAPO-Math-17k DeepSeek-R1-Distill-Qwen-1.5B policy (Held-out) | Intra AUC | 71 | 2 |