
NLU Benchmark Suite

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Zero-shot Natural Language Understanding | NLU Benchmark Suite CMNLI, HeSW, PIQA, WSC, CoQA, BoolQ, Race-M, Race-H, XSum, C3 | CMNLI Accuracy: 34.43 | 8 |
| Natural Language Understanding | NLU Benchmark Suite (SST2, COPA, CB, BoolQ, RTE, WiC) Pangu-1B (NPU) (val) | SST2 Accuracy: 82.1 | 6 |