Macro-average (MMLU, MATH, GSM8K, BBH)

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Question Answering and Reasoning | Macro-average (MMLU, MATH, GSM8K, BBH) (test) | Cost Reduction - | 0