ARC, BoolQ, HellaSwag

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Task Evaluation | ARC-C, ARC-E, BoolQ, and HellaSwag | Accuracy: 69.48 | 28