Short-context benchmarks

Benchmarks

Task Name	Dataset Name	SOTA Result	Trend
Question Answering and Commonsense Reasoning	Short-context benchmarks ARC-C, ARC-E, PIQA, Winogrande, HellaSwag	ARC-C Accuracy 63.48		45
Multiple Choice Question Answering and Reasoning	Short Context Benchmarks MMLU, SciQ, OQA, CQA, SIQA, PIQA, HellaSwag, WinoGrande, ARC-c, ARC-e	MMLU Accuracy 74.25		10
