
IMO-AnswerBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Math Reasoning | IMO-AnswerBench 50 | Pass@1 Accuracy: 40.5 | 68 |
| Mathematical Reasoning | IMO-AnswerBench | Accuracy: 84.5 | 20 |
| Math | IMO-ANSWERBENCH | Score: 59.3 | 13 |
| Mathematical Reasoning | IMO-AnswerBench | Pass@1: 83.3 | 12 |
| Multiple-choice Question Answering | IMO-AnswerBench (full) | NComm: 22.8 | 10 |
| Reasoning & General | IMO-AnswerBench | Score: 86.3 | 7 |
| Mathematical Reasoning | IMO-AnswerBench | Pass@1: 25.62 | 6 |
| Mathematical Reasoning | IMO-AnswerBench (test) | Pass@1: 25.62 | 4 |