Language Understanding Evaluation Suite

Benchmarks

Task Name: Language Understanding and Reasoning
Dataset Name: Language Understanding Evaluation Suite (Arc-c, Arc-e, BoolQ, COPA, MMLU, OBQA, PIQA, RTE, Winogrande) Zyda2 calibration (test)
SOTA Result: ARC-c 58.28
Trend: 6