General Understanding Tasks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Language Understanding | General Understanding Tasks ARC-E, BoolQ, Wino., PIQA, HellaSwag, TruthfulQA, OBQA, LogiQA | ARC-E Accuracy 64.1 | 8
General Understanding | General Understanding Tasks Chinese (ZH) Translated | Accuracy 45.92 | 3
General Understanding | General Understanding Tasks Japanese (JA) Translated | Average Accuracy 44 | 3
General Understanding | General Understanding Tasks French (FR) Translated | Avg Accuracy 47.52 | 3
General Understanding | General Understanding Tasks German (DE) Translated | Accuracy 47.16 | 3
General Understanding | General Understanding Tasks English (EN) Translated | Avg Accuracy 61.07 | 2