
Large-scale model pool

Benchmarks

Task Name               Dataset Name                                     SOTA Result     Trend
Language Generation     Large-scale model pool 15 LLMs                   Accuracy 51.9   3
Math Reasoning          Large-scale model pool Math Reasoning 15 LLMs    Accuracy 73.3   3
Logic Reasoning         Large-scale model pool Logic Reasoning 15 LLMs   Accuracy 95.6   3
Reading & QA            Large-scale model pool Reading&QA 15 LLMs        Accuracy 88     3
Language Understanding  Large-scale model pool Language Understanding    Accuracy 84     3