
ANLI

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Natural Language Inference | ANLI | Accuracy 74.02 | 65
Natural Language Inference | ANLI Round 3 | Accuracy 67.9 | 64
Natural Language Inference | ANLI Round 2 | Accuracy 66.5 | 64
Natural Language Inference | ANLI Round 1 | Accuracy 77 | 57
Abductive Commonsense Reasoning | aNLI (test) | Accuracy 92.9 | 53
Abductive Natural Language Inference | aNLI (leaderboard) | Accuracy 93.2 | 47
Commonsense Reasoning | aNLI | Accuracy 87.3 | 35
Natural Language Inference | ANLI R3 1.0 (test) | Weighted F1 34.9 | 28
Natural Language Inference | ANLI R2 1.0 (test) | Weighted F1 0.331 | 28
Natural Language Inference | ANLI R1 1.0 (test) | Weighted F1 41.1 | 28
Natural Language Inference | ANLI (test) | Overall Score 92.2 | 28
Natural Language Inference | ANLI | ANLI R1 Accuracy 73.1 | 27
Natural Language Inference | ANLI R3 (test) | Accuracy 44.7 | 26
Natural Language Inference | ANLI R1 (test) | Accuracy 44.3 | 26
Natural Language Inference | ANLI R2 | Accuracy 81.14 | 24
Natural Language Inference | ANLI (val) | Accuracy 73.37 | 21
Natural Language Inference | ANLI R2 (test) | Accuracy 33.1 | 20
Natural Language Inference | ANLI Round 2 (test) | Accuracy 51.4 | 14
Natural Language Inference | ANLI (dev) | R1 Score 76.4 | 13
Natural Language Inference | ANLI | Best F1 (ANLI) 75 | 8
Natural Language Inference | ANLI R3 | Accuracy 46.67 | 8
Natural Language Inference | ANLI R1 | Accuracy 47.3 | 8
Natural Language Inference | ANLI MNLI + SNLI trained (test) | ANLI A1 Score 50 | 8
Natural Language Inference | ANLI MNLI + SNLI trained (dev) | Accuracy (A1) 50.4 | 8
Natural Language Inference | ANLI R3 | Accuracy 67.1 | 7
Showing 25 of 41 rows