Mathematical Reasoning on AIME 2025 (Pass@32, Memory, Length)
[Chart: Pass@32 over time, with alternate views for Memory Usage and Sequence Length. Highlighted entry: Vanilla, Pass@32 = 96, Dec 12, 2025.]
Evaluation Results
Method    Base Model       Date      Pass@32   Memory Usage   Sequence Length
Vanilla   Qwen3-Max        2025.12   96        -              -
MNL       Qwen3-Max        2025.12   96        10             0
TFGO      DeepSeek-V3...   2025.12   90        -              696
TFGO      Qwen3-Max        2025.12   90        -              1,452
MNL       DeepSeek-V3...   2025.12   83        9              60
Vanilla   DeepSeek-V3...   2025.12   80        -              -
ACE       DeepSeek-V3...   2025.12   67        163            21,318
MNL       Qwen3-8B         2025.12   30        51             67
Memento   Qwen3-8B         2025.12   27        100            3,100
Vanilla   Qwen3-8B         2025.12   23        -              -
TFGO      Qwen3-8B         2025.12   23        -              703
ACE       Qwen3-8B         2025.12   10        100            7,355
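
The results above are easier to compare when each method is measured against the Vanilla run on the same base model. The following is a minimal sketch of that arithmetic, not part of the leaderboard itself. The name `gains_over_vanilla` and the `ROWS` layout are hypothetical choices for illustration. The Pass@32 values are copied from the results as listed, and base-model names are kept exactly as shown, including the truncated "DeepSeek-V3...".

```python
# Rows transcribed from the evaluation results: (method, base model, Pass@32).
# Base-model names are kept verbatim, including the truncated "DeepSeek-V3...".
ROWS = [
    ("Vanilla", "Qwen3-Max", 96),
    ("MNL", "Qwen3-Max", 96),
    ("TFGO", "DeepSeek-V3...", 90),
    ("TFGO", "Qwen3-Max", 90),
    ("MNL", "DeepSeek-V3...", 83),
    ("Vanilla", "DeepSeek-V3...", 80),
    ("ACE", "DeepSeek-V3...", 67),
    ("MNL", "Qwen3-8B", 30),
    ("Memento", "Qwen3-8B", 27),
    ("Vanilla", "Qwen3-8B", 23),
    ("TFGO", "Qwen3-8B", 23),
    ("ACE", "Qwen3-8B", 10),
]


def gains_over_vanilla(rows):
    """Map (method, base_model) to its Pass@32 minus the Vanilla Pass@32
    on the same base model. Vanilla rows themselves are excluded."""
    baseline = {base: score for method, base, score in rows if method == "Vanilla"}
    return {
        (method, base): score - baseline[base]
        for method, base, score in rows
        if method != "Vanilla" and base in baseline
    }


if __name__ == "__main__":
    gains = gains_over_vanilla(ROWS)
    # Print the largest improvements first.
    for (method, base), delta in sorted(gains.items(), key=lambda kv: -kv[1]):
        print(f"{method:8s} {base:15s} {delta:+d}")
```

The differences are in the same units as the Pass@32 column. They say nothing about Memory Usage or Sequence Length, which would need to be weighed separately.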