Natural Language Inference on SNLI (Statistical Measures)
[Chart: Correlation Coefficient by method. Highlighted point: BERTScore, 83.05 (May 26, 2026). Other available metrics: Incorrectness Score, N Delta.]
Updated 7d ago
Evaluation Results
| Method | Setting | Date | Correlation Coefficient | Incorrectness Score | N Delta |
|---|---|---|---|---|---|
| BERTScore | | 2026.05 | 83.05 | 79.34 | 3.71 |
| MATCHA | rescaled to [0, 1]=true | 2026.05 | 71.14 | 1.24 | 34.95 |
| SimCSE | rescaled to [0, 1]=true | 2026.05 | 69.47 | 33.43 | 18.03 |
| EmbSim | rescaled to [0, 1]=true | 2026.05 | 65.7 | 33.36 | 16.17 |
| MAUVE | | 2026.05 | 41.5 | 47.67 | -6.17 |
| R1-F1 | | 2026.05 | 41.34 | 28.58 | 12.76 |
| RL-F1 | | 2026.05 | 38.24 | 26.44 | 11.8 |
| METEOR | | 2026.05 | 30.08 | 20.78 | 9.3 |
| R2-F1 | | 2026.05 | 19.07 | 10.26 | 8.81 |
| BLEURT | rescaled to [0, 1]=true | 2026.05 | -64.06 | -99.63 | 17.79 |