Accuracy on AIME 25 (Mathematics Competition)
[Chart: Accuracy over time. Top result: TaH+, 27.9 accuracy, Nov 11, 2025.]
Evaluation Results
Method       Model Size  Date     Accuracy
TaH+         4B*         2025.11  27.9
TaH          4B*         2025.11  27.1
SoftThink    4B*         2025.11  25.8
Ouro         4B*         2025.11  25
Standard     4B*         2025.11  24.2
TaH+         1.7B        2025.11  15.4
TaH          1.7B        2025.11  13.8
Standard     1.7B        2025.11  10.8
Ouro         1.7B        2025.11  10.8
AlwaysThink  1.7B        2025.11  7.5
SoftThink    1.7B        2025.11  5.4
TaH+         0.6B        2025.11  4.6
SoftThink    0.6B        2025.11  2.9
Ouro         0.6B        2025.11  2.1
TaH          0.6B        2025.11  2.1
Standard     0.6B        2025.11  1.9
AlwaysThink  0.6B        2025.11  1.3