Omni-MATH

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | Omni-MATH | Accuracy: 66.9 | 68 |
| Mathematical Reasoning | Omni-Math | Average Score @8: 25.34 | 14 |
| Math | Omni-MATH | Score: 54.1 | 10 |
| Reasoning Episode Classification | Omni-MATH human-annotated Reasoning episodes (gold set) | Accuracy: 86.33 | 8 |
| Mathematical & Symbolic Reasoning | Omni-MATH Tier 2 | Success Rate (SR): 42.7 | 6 |
| Difficulty Correlation with LLM Performance | Omni-Math | Pearson PCC: 0.91 | 4 |
| Reasoning Episode Classification | Omni-MATH Non-Reasoning episodes (human-annotated gold set) | Accuracy: 89.34 | 4 |
| Difficulty Correlation with Human Labels | Omni-Math n=1876 | Pearson Correlation: 0.82 | 2 |