Mathematical Reasoning on AIME 2024-2026 & HMMT 2025 Suite
[Chart: Accuracy Change per method (TRS), y-axis from about -1.23 to 1.19, top value 1.88, dated Apr 23, 2026. Selectable metrics: Accuracy Change, Output Token Change (%), Cost Change (%).]
Updated 1mo ago
Evaluation Results
| Method | Model | Date | Accuracy Change | Output Token Change (%) | Cost Change (%) |
|---|---|---|---|---|---|
| TRS | Doubao-1.8 | 2026.04 | 1.88 | 4.9 | 2.8 |
| TRS | Gemini-3-Flash | 2026.04 | 0.92 | 8.6 | 9 |
| TRS | GPT-OSS-120B | 2026.04 | 0.5 | 8.1 | 6.5 |
| TRS | GPT-OSS-20B | 2026.04 | -0.6 | 5.3 | 4.3 |
| TRS | Doubao-2.0-Pro | 2026.04 | -1.11 | 17.5 | 15.7 |