Mathematical Reasoning on AIME decontaminated 24 (Accuracy)
[Chart: Accuracy by date. Highlighted point: DAPO-Math-17k, 27.9. Date shown: May 26, 2026.]
Evaluation Results
Method               Details                     Date     Accuracy
DAPO-Math-17k        Backbone=Qwen3-8B, Sam...   2026.05  27.9
DAPO++               Backbone=Qwen3-8B, Sam...   2026.05  25.9
DeepScaleR           Backbone=Qwen3-8B, Sam...   2026.05  23.4
Skywork-OR1-RL-Data  Backbone=Qwen3-8B, Sam...   2026.05  17.3
DeepMath-103K        Backbone=Qwen3-8B, Sam...   2026.05  17.1
OpenR1-Math-220k     Backbone=Qwen3-8B, Sam...   2026.05  16.5
Qwen3-8B-Base        Backbone=Qwen3-8B, Sam...   2026.05  10.4
DAPO++               Backbone=Qwen3-1.7B, S...   2026.05  9.4
DeepMath-103K        Backbone=Qwen3-1.7B, S...   2026.05  8.4
DAPO-Math-17k        Backbone=Qwen3-1.7B, S...   2026.05  7
OpenR1-Math-220k     Backbone=Qwen3-1.7B, S...   2026.05  6.9
DeepScaleR           Backbone=Qwen3-1.7B, S...   2026.05  6.3
Skywork-OR1-RL-Data  Backbone=Qwen3-1.7B, S...   2026.05  5.8
Qwen3-1.7B-Base      Backbone=Qwen3-1.7B, S...   2026.05  4.7