MMLU-CoT, GSM8k, HellaSwag, WinoGrande

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Language Modeling and Reasoning | MMLU-CoT, GSM8k, HellaSwag, WinoGrande zero-shot Llama-3.1-8B-Instruct | MMLU-CoT Accuracy: 72.76 | 30