SICK

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Natural Language Inference | SICK | Accuracy | 91.58 | 85 |
| Sentence Similarity | SICK | Spearman Correlation | 70.82 | 56 |
| Semantic Relatedness | SICK 2014 (test) | Pearson's r | 0.884 | 56 |
| Semantic Textual Similarity | SICK Slovak (val) | Pearson Correlation | 0.778 | 33 |
| Semantic Textual Similarity | SICK-R (test) | Similarity Score | 60.74 | 30 |
| Outlier Detection | Sick | AP (%) | 36.3 | 22 |
| Semantic Similarity | SICK | Accuracy | 652.1 | 21 |
| Classification | Sick (test) | Accuracy | 98.94 | 21 |
| Textual Entailment | SICK (test) | Accuracy | 90.3 | 21 |
| Sentence Relatedness | SICK (test + train) | Spearman Correlation | 0.61 | 21 |
| Semantic Textual Similarity | SICK-R | Spearman Rho (x100) | 72.56 | 16 |
| Classification | Sick | F1 Score | 88.71 | 15 |
| Natural Language Entailment | SICK-E | Spearman Rho (x100) | 71.26 | 12 |
| Semantic Textual Similarity | SICK (test) | Spearman Correlation | 0.7669 | 12 |
| Semantic Relatedness | SICK | Pearson r | 0.868 | 12 |
| Outlier Detection | Sick | AUC | 0.918 | 11 |
| Outlier Detection | Sick | AUC-ROC | 90.2 | 11 |
| Outlier Detection | Sick | AUC-PR | 0.355 | 11 |
| Imbalanced Classification | Sick | Macro F1 | 89.63 | 8 |
| Sentence Relatedness | SICK (test) | Pearson Correlation (r) | 0.8695 | 7 |
| Semantic Similarity | SICK-R (test) | Semantic Consistency (spring) | 67.03 | 5 |
| Natural Language Inference Explanation Evaluation | SICK (sample) | Average Score | 95.63 | 4 |
| Semantic Textual Similarity | SICK | Pearson Correlation | 0.915 | 4 |
| Semantic Relatedness | SICK (test) | MSE | 0.233 | 4 |
| Regression | SICK-R | Spearman Correlation | 86.54 | 3 |
Showing 25 of 29 rows