
MultiNLI

Benchmarks

Task Name                       Dataset Name                                            SOTA Result        Trend
Natural Language Inference      MultiNLI matched (test)                                 Accuracy 85.38     65
Natural Language Inference      MultiNLI Mismatched                                     Accuracy 79.1      60
Natural Language Inference      MultiNLI mismatched (test)                              Accuracy 81.4      56
Natural Language Inference      MultiNLI Matched                                        Accuracy 80.2      49
Natural Language Inference      MultiNLI mismatched (cross-domain) RepEval 2017 (test)  Accuracy 75.8      25
Natural Language Inference      MultiNLI matched (dev)                                  Accuracy 88.4      23
Natural Language Inference      MultiNLI (test)                                         Accuracy 83.7      21
Text Classification             MultiNLI (test)                                         WGA 81.3           18
Natural Language Inference      MultiNLI matched (in-domain) RepEval 2017 (test)        Accuracy 76.8      18
Confidence Calibration          MultiNLI Mismatch (test)                                ECE 0.0071         16
Natural Language Understanding  MultiNLI (Match)                                        ECE 1.02           16
Natural Language Inference      MultiNLI mismatched (dev)                               Accuracy 88.4      11
Natural Language Inference      MultiNLI matched/mismatched                             Accuracy 92.6      10
Natural Language Inference      MultiNLI matched (in-domain)                            Accuracy 74.6      8
Natural Language Inference      MultiNLI matched (val)                                  Accuracy 91.7      8
Natural Language Inference      MultiNLI WILDS (test)                                   IID Accuracy 82.1  6
Natural Language Inference      MultiNLI (val)                                          Accuracy 73.17     5
Natural Language Inference      MultiNLI                                                Accuracy 82.4      3