
SICK

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Semantic Relatedness | SICK 2014 (test) | Pearson's r | 0.884 | 56 |
| Semantic Textual Similarity | SICK Slovak (val) | Pearson Correlation | 0.778 | 33 |
| Semantic Textual Similarity | SICK-R (test) | Similarity Score | 60.74 | 30 |
| Outlier Detection | Sick | AP (%) | 36.3 | 22 |
| Classification | Sick (test) | Accuracy | 98.94 | 21 |
| Textual Entailment | SICK (test) | Accuracy | 90.3 | 21 |
| Sentence Relatedness | SICK (test + train) | Spearman Correlation | 0.61 | 21 |
| Natural Language Inference | SICK | Accuracy | 90.9 | 16 |
| Classification | Sick | F1 Score | 88.71 | 15 |
| Natural Language Entailment | SICK-E | Spearman Rho (x100) | 71.26 | 12 |
| Semantic Textual Similarity | SICK (test) | Spearman Correlation | 0.7669 | 12 |
| Semantic Relatedness | SICK | Pearson r | 0.868 | 12 |
| Outlier Detection | Sick | AUC | 0.918 | 11 |
| Outlier Detection | Sick | AUC-ROC | 90.2 | 11 |
| Outlier Detection | Sick | AUC-PR | 0.355 | 11 |
| Semantic Textual Similarity | SICK-R | Spearman Rho (x100) | 65.44 | 11 |
| Sentence Relatedness | SICK (test) | Pearson Correlation (r) | 0.8695 | 7 |
| Semantic Similarity | SICK-R (test) | Semantic Consistency (spring) | 67.03 | 5 |
| Natural Language Inference Explanation Evaluation | SICK (sample) | Average Score | 95.63 | 4 |
| Semantic Textual Similarity | SICK | Pearson Correlation | 0.915 | 4 |
| Semantic Relatedness | SICK (test) | MSE | 0.233 | 4 |
| Regression | SICK-R | Spearman Correlation | 86.54 | 3 |
| Single-label Classification | SICK-E | Accuracy | 88.96 | 3 |
| Sentence Ranking | SICK-R | KCC | 57.4 | 3 |
| Semantic Relatedness | SICK filtered 2014 (test) | RMSE | 0.24 | 3 |
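Most of the relatedness results above are reported as Pearson r, Spearman correlation, or MSE between a system's predicted relatedness scores and the human gold scores (SICK relatedness is annotated on a 1–5 scale). As a minimal sketch of how those three numbers are computed, the snippet below uses hypothetical gold/predicted score lists — the data and values are illustrative, not taken from any entry in the table:

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(x):
    """Ranks of the values in x (1-based), averaging ranks over ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))

def mse(x, y):
    """Mean squared error between gold and predicted scores."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

# Hypothetical gold and predicted relatedness scores on a 1-5 scale.
gold = [4.5, 3.6, 1.2, 2.8, 5.0]
pred = [4.2, 3.9, 1.5, 2.5, 4.8]
print(f"Pearson r = {pearson(gold, pred):.4f}")
print(f"Spearman rho = {spearman(gold, pred):.4f}")
print(f"MSE = {mse(gold, pred):.4f}")
```

Note that leaderboard entries are not always directly comparable even when the metric name matches: some report Spearman rho scaled by 100 (the "Rho (x100)" rows), and others evaluate on different splits (test vs. test + train).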