Mathematical Reasoning on DAPO (test)
Top result: TROLL, 72.8 Success Rate (Oct 4, 2025)
Evaluation Results
Method  Details                    Date     Success Rate
TROLL   Model=Qwen3-8B, Advant...  2025.10  72.8
TROLL   Model=Qwen3-8B, Advant...  2025.10  71.5
TROLL   Model=Qwen3-8B, Advant...  2025.10  70.6
TROLL   Model=Qwen3-8B, Advant...  2025.10  69.1
TROLL   Model=Qwen3-8B, Advant...  2025.10  67.4
Clip    Model=Qwen3-8B, Advant...  2025.10  65.3
Clip    Model=Qwen3-8B, Advant...  2025.10  64
Clip    Model=Qwen3-8B, Advant...  2025.10  62.6
Clip    Model=Qwen3-8B, Advant...  2025.10  60.2
TROLL   Model=Qwen2.5-7B-Instr...  2025.10  39
TROLL   Model=Qwen2.5-7B-Instr...  2025.10  38.9
TROLL   Model=Qwen2.5-7B-Instr...  2025.10  38.9
TROLL   Model=Qwen2.5-7B-Instr...  2025.10  38
TROLL   Model=Qwen2.5-7B-Instr...  2025.10  35.3
Clip    Model=Qwen2.5-7B-Instr...  2025.10  33.1
Clip    Model=Qwen2.5-7B-Instr...  2025.10  32.4
Clip    Model=Qwen2.5-7B-Instr...  2025.10  32.3
Clip    Model=Qwen2.5-7B-Instr...  2025.10  32.3
Clip    Model=Qwen2.5-7B-Instr...  2025.10  9.3
Clip    Model=Qwen3-8B, Advant...  2025.10  0