Common Sense QA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Commonsense Reasoning | Common Sense QA (test) | ARC-C Accuracy (5-shot): 58 | 20
Commonsense Reasoning | Common sense QA | AUCOAA: 81.4 | 11
Retention Utility | Common Sense QA | AR Score: 95.4 | 7