
General Knowledge Evaluation Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
General Knowledge Evaluation | General Knowledge Evaluation Suite (ARC, HellaSwag, LAMBADA, PIQA, SciQ, WinoGrande, TriviaQA, WebQS, MMLU, GSM8K) | ARC-C: 60.2 | 5