
ARC, Winogrande, HellaSwag, PIQA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
--- | --- | --- | ---
Zero-shot Reasoning | ARC-e, Winogrande, HellaSwag, PIQA | Normalized Avg Accuracy: 77.2 | 36