
Commonsense Reasoning Tasks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Commonsense Reasoning | Commonsense Reasoning Tasks (ARC-e, OBQA, SIQA, ARC-c, WinoG, PIQA, BoolQ, HellaS) LLaMA3-8B | ARC-e Accuracy: 92.9 | 13