
Downstream Evaluation Suite (ARC, PIQA, HellaSwag, WinoGrande, LAMBADA, RACE)

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Downstream Evaluation | Downstream Evaluation Suite (ARC, PIQA, HellaSwag, WinoGrande, LAMBADA, RACE) Zero-shot | ARC-E Accuracy: 56.57 | 14