Common Sense Reasoning Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Common Sense Reasoning | Common Sense Reasoning Suite (PIQA, ARC-Easy, ARC-Challenge, BoolQ, HellaSwag, Winogrande) zero-shot (test dev) | PIQA 79.35 | 30
Common Sense Reasoning | Common Sense Reasoning Suite (PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA) zero-shot | PIQA 81.12 | 20
Common-sense Reasoning | Common-sense Reasoning Suite (PIQA, HellaSwag, ARC-C, ARC-E, OBQA) (test) | PIQA Accuracy 84.1 | 18
Common Sense Reasoning | Common Sense Reasoning Suite ARC, BoolQ, RTE, Winogrande, TruthfulQA | ARC Challenge Accuracy 34 | 8
Zero-shot task classification | Common Sense Reasoning Suite (PIQA, HellaSwag, WSC, BoolQ, RACE-H) zero-shot | PIQA 71.22 | 5