
Beyond AIME

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | Beyond-AIME | Avg@5 Score: 77.6 | 48 |
| Mathematical Reasoning | Beyond-AIME v1 (test) | Avg@5: 76.6 | 32 |
| Mathematical Reasoning | Beyond AIME | Accuracy: 58.8 | 32 |
| Mathematical Reasoning | Beyond AIME | Pass@1: 22 | 21 |
| Mathematical Reasoning | Beyond-AIME | Seed Score: 76.6 | 16 |
| Mathematical Reasoning | Beyond-AIME | Pass@1: 38.9 | 10 |
| Mathematical Reasoning | Beyond-AIME VeRA-H VeRA-H Pro | Avg@5 Accuracy (Seeds): 58.34 | 1 |
| Mathematical Reasoning | Beyond-AIME VeRA-E | Avg@5 Accuracy (Seeds): 58.34 | 1 |