
BENCH-PROXY

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Language Understanding and Reasoning | BENCH-PROXY (MMLU, ANLI, HellaSwag, PIQA, SIQA, W.G., ARC-E, ARC-C, C.QA, WSC) (test) | MMLU: 34.32 | 24