MNLI

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Natural Language Inference | MNLI (matched) | Accuracy | 91.7 | 110 |
| Natural Language Inference | MNLI | Accuracy (matched) | 90.8 | 80 |
| Natural Language Inference | MNLI (mismatched) | Accuracy | 91 | 68 |
| Natural Language Inference | MNLI (test) | Accuracy | 0.898 | 48 |
| Backdoor Defense | MNLI (test) | Clean Accuracy | 91.8 | 44 |
| Natural Language Inference | MNLI (dev) | Acc (m) | 90.2 | 44 |
| Natural Language Inference | MNLI | Accuracy | 86.2 | 36 |
| Text Classification | MNLI | Accuracy | 87.45 | 32 |
| Classification | MNLI (val) | Accuracy | 84.17 | 32 |
| Natural Language Inference | MNLI mm | Accuracy | 90.7 | 30 |
| Natural Language Inference | MNLI (val) | Accuracy | 92.13 | 26 |
| Natural Language Inference | MNLI | Accuracy | 87.98 | 22 |
| Natural Language Inference | MNLI few-shot zero-shot | Accuracy | 71.1 | 16 |
| Structural Bias Evaluation | MNLI | Accuracy | 98.1 | 14 |
| Natural Language Inference | MNLI | Normalized Accuracy | 96.2 | 13 |
| Natural Language Inference | MNLI-m | Accuracy | 77.2 | 13 |
| Natural Language Inference | MNLI Unknown Bias (in-distribution) | Accuracy | 84.2 | 13 |
| Natural Language Inference | MNLI HardSP (challenge) | Accuracy | 83.2 | 13 |
| Natural Language Inference | MNLI HardCD (challenge) | Accuracy | 0.803 | 13 |
| Natural Language Inference | MNLI Hypothesis-only Bias (in-distribution) | Accuracy | 84.2 | 13 |
| Natural Language Inference | MNLI Syntactic Bias (in-distribution) | Accuracy | 84.3 | 13 |
| Natural Language Inference | MNLI (all combined) | Accuracy | 85.98 | 12 |
| Natural Language Inference | MNLI-m (dev) | Accuracy | 90.6 | 12 |
| Hallucination Detection | MNLI (test) | AuROC | 100 | 10 |
| Natural Language Inference | MNLI | Accuracy | 89.5 | 10 |
Showing 25 of 68 rows