OmniMath

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Mathematical Reasoning | OMNIMATH random subset of 128 samples | Top-1 Accuracy: 10.93 | 12 |
| Mathematical Reasoning | OmniMath (test) | Top-1 Accuracy: 0.446 | 8 |
| Mathematical Reasoning | OmniMath (train) | Training Dataset (%): 50.1 | 3 |