Standard Downstream Benchmarks

Benchmarks

Task Name	Dataset Name	SOTA Result	Trend
Natural Language Understanding and Reasoning	Standard Downstream Benchmarks Two-Shot (val)	ARC-E Accuracy (Normalized): 56.86		11
