
Downstream Reasoning Tasks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Question Answering | Downstream Reasoning Tasks (ARC-c, ARC-e, BoolQ, HellaSwag, MMLU, OpenBookQA, PIQA, Winogrande) | ARC-c Accuracy (Zero-shot): 58.4 | 15