Mathematical Reasoning on AIME Aya (Complete)
[Figure: accuracy chart. Top entry: Input-all, 65.8 accuracy. Date shown: May 21, 2026.]
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| Input-all | k=5, Summarizer=Aya | 2026.05 | 65.8 |
| Model-first Greedy | k=5, Summarizer=Aya | 2026.05 | 65.4 |
| Truth-prediction Greedy | k=5, Summarizer=Aya | 2026.05 | 61.1 |
| Oracle-surrogate Greedy | k=5, Summarizer=Aya | 2026.05 | 60.7 |
| MoA | k=5, Summarizer=Aya | 2026.05 | 58 |
| Conditioned-diversity | k=5, Summarizer=Aya | 2026.05 | 48.3 |
| GPT5.2-judge | k=5, Summarizer=Aya | 2026.05 | 44.8 |
| Aya-judge | k=5, Summarizer=Aya | 2026.05 | 43.5 |
| Top-accuracy | k=5, Summarizer=Aya | 2026.05 | 37.7 |
| Best-model | k=5, Summarizer=Aya | 2026.05 | 34.9 |