Mathematical Reasoning on MATH (Correction Uplift %)
[Chart: Correction Uplift over time. Top entry: ROSA, 48.87, Sep 27, 2025.]
Updated 1mo ago
Evaluation Results
Method    Model                 Date     Correction Uplift
ROSA      Qwen3-0.6B            2025.09  48.87
ROSA      Qwen2.5-7B-Instruct   2025.09  41.53
ROSA      Qwen3-8B              2025.09  40.42
ROSA      Qwen2.5-0.5B-Ins...   2025.09  25.48
Baseline  Qwen3-8B              2025.09  23
Baseline  Qwen3-0.6B            2025.09  17.4
Baseline  Qwen2.5-7B-Instruct   2025.09  12.54
Baseline  Qwen2.5-0.5B-Ins...   2025.09  6.88
ROSA      DeepSeek-R1-Dist...   2025.09  6.3
Baseline  DeepSeek-R1-Dist...   2025.09  4.05