Mathematical Problem Solving on AIME 2025 (Pass@1)
[Chart: Pass@1 over time, Jul 17, 2025 to Dec 2, 2025. Top result: Gemini-3.0 Pro, 95.]
Updated 17d ago
Evaluation Results
Method                    Settings                   Date     Pass@1
Gemini-3.0 Pro            template=reason step b...  2025.12  95
GPT-5 High                template=reason step b...  2025.12  94.6
Kimi-K2                   thinking mode=true, te...  2025.12  94.5
DeepSeek-V3.2             thinking mode=true, te...  2025.12  93.1
Claude-4.5-Sonnet         template=reason step b...  2025.12  87
MiniMax M2                template=reason step b...  2025.12  78.3
Qwen3-8B                  k (responses per quest...  2025.07  67.3
QUESTA-Nemotron-1.5B      k (responses per quest...  2025.07  62.29
DeepSeek-R1-Distill-32B   k (responses per quest...  2025.07  51.8
Nemotron-1.5B             k (responses per quest...  2025.07  49.5
Qwen3-1.7B                k (responses per quest...  2025.07  36.8
DeepSeek-R1-Distill-1.5B  k (responses per quest...  2025.07  22.3