Mathematical Reasoning on AIME 2024 (Pass@32, Mem, Len)
[Chart: Pass@32 over time. Highlighted result: TFGO, Pass@32 93, Dec 12, 2025. Available metrics: Pass@32, Memory Usage, Output Length.]
Evaluation Results
Method    Base Model       Date      Pass@32   Memory Usage   Output Length
TFGO      DeepSeek-V3...   2025.12   93        -              696
Vanilla   Qwen3-Max        2025.12   93        -              -
MNL       Qwen3-Max        2025.12   93        10             0
MNL       DeepSeek-V3...   2025.12   90        9              60
TFGO      Qwen3-Max        2025.12   90        -              1,452
Vanilla   DeepSeek-V3...   2025.12   87        -              -
ACE       DeepSeek-V3...   2025.12   80        163            21,318
MNL       Qwen3-8B         2025.12   33        51             67
Vanilla   Qwen3-8B         2025.12   30        -              -
ACE       Qwen3-8B         2025.12   27        100            7,355
TFGO      Qwen3-8B         2025.12   23        -              703
Memento   Qwen3-8B         2025.12   20        100            3,100