Common Sense Reasoning Tasks

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Commonsense Reasoning | Common Sense Reasoning Tasks | Avg Score: 93 | 241 |
| Common Sense Reasoning | Common Sense Reasoning Tasks (ARC-C, ARC-E, BoolQ, HellaSwag, PIQA, WinoGrande) zero-shot | Average Accuracy (Zero-Shot): 74.19 | 72 |
| Common-sense Reasoning | 5 common-sense reasoning tasks Llama-3-8B | Average Accuracy: 87.07 | 27 |
| Common-sense Reasoning | Common-sense reasoning tasks (ARC-C, ARC-E, HellaSwag, Lambada, PIQA, WinoGrande) (test) | ARC-C Accuracy: 44.88 | 16 |
| Common-sense Reasoning | 5 common-sense reasoning tasks Llama-2-70B | Average Accuracy: 72.41 | 15 |
| Common-sense Reasoning | 5 common-sense reasoning tasks Llama-2-13B | Accuracy: 67.81 | 15 |
| Common-sense Reasoning | 5 common-sense reasoning tasks Llama-3-70B | Average Accuracy: 75.33 | 9 |