Mathematical Reasoning on AIME (Pass@1 %)
[Chart: Pass@1 Accuracy. Highlighted point: GPT-OSS-120B, 90, Apr 10, 2026. Updated 5d ago.]
Evaluation Results
Method                      Configuration               Date      Pass@1 Accuracy (%)
GPT-OSS-120B                Sampling Strategy=4-sa...   2026.04   90
GPT-OSS-20B                 Sampling Strategy=4-sa...   2026.04   86.67
Aryabhata 2                 Sampling Strategy=4-sa...   2026.04   86.67
Qwen3-30B-A3B (Thinking)    Sampling Strategy=4-sa...   2026.04   84.58
GPT-5 Mini                  Sampling Strategy=4-sa...   2026.04   83.33
Nemotron 3 Nano 30B A3B     Sampling Strategy=4-sa...   2026.04   77.08
GPT-5 Nano                  Sampling Strategy=4-sa...   2026.04   74.17
Gemini 2.5 Flash            Sampling Strategy=4-sa...   2026.04   66.61