Mathematics on AIME Art of Problem Solving 2025 2026 (Score)
[Chart: Score by date. AgentSPEX: 100 (Apr 14, 2026)]
Updated 3d ago
Evaluation Results
Method      Method (details)             Links     Score
AgentSPEX   Model=GPT-5, Domain=Ma...    2026.04   100
CoT         Model=GPT-5 (with Pyth...    2026.04   99.6
CoT         Model=GPT-5 (without t...    2026.04   94.6