AMO-Bench

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Mathematical Reasoning | AMO-Bench | Avg@5 | 0.646 | 48 |
| Mathematical Reasoning | AMO-Bench | Mean@64 Accuracy | 11.8 | 27 |
| Mathematical Reasoning | AMO-Bench | Accuracy (Acc) | 20.5 | 20 |
| Mathematical Reasoning | AMO-Bench | Pass@8 | 36.72 | 20 |
| Mathematical Reasoning | AMO-Bench | Seed (Avg@5) | 0.56 | 16 |
| Mathematical Reasoning | AMO-Bench | Average@16 | 14.8 | 12 |
| Mathematical Reasoning | AMO-Bench | Pass@1 Score | 2.6 | 8 |
| Mathematical Reasoning | AMO-Bench | AMO-Bench Accuracy | 59.8 | 6 |
| Mathematical Reasoning | AMO-Bench VeRA-H / VeRA-H Pro | Avg@5 Accuracy (Seeds) | 31.75 | 1 |