AMO-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Mathematical Reasoning | AMO-Bench | Avg@5: 0.646 | 48 |
| Mathematical Reasoning | AMO-Bench | Mean@64 Accuracy: 11.8 | 27 |
| Mathematical Reasoning | AMO-Bench | Seed (Avg@5): 0.56 | 16 |
| Mathematical Reasoning | AMO-Bench | Average@16: 14.8 | 12 |
| Mathematical Reasoning | AMO-Bench | Pass@8: 36.72 | 6 |
| Mathematical Reasoning | AMO-Bench VeRA-H / VeRA-H Pro | Avg@5 Accuracy (Seeds): 31.75 | 1 |