
IMO-AnswerBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | IMO-AnswerBench | Accuracy: 84.5 | 20 |
| Mathematical Reasoning | IMO-AnswerBench | Pass@1: 83.3 | 12 |
| Math | IMO-ANSWERBENCH | Score: 53.8 | 9 |
| Reasoning & General | IMO-AnswerBench | Score: 86.3 | 7 |
| Mathematical Reasoning | IMO-AnswerBench | Pass@1: 25.62 | 6 |
| Mathematical Reasoning | IMO-AnswerBench (test) | Pass@1: 25.62 | 4 |