SICK

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result	Count
Natural Language Inference	SICK	Accuracy	91.58	85
Sentence Similarity	SICK	Spearman Correlation	70.82	56
Semantic Relatedness	SICK 2014 (test)	Pearson's r	0.884	56
Semantic Textual Similarity	SICK Slovak (val)	Pearson Correlation	0.778	33
Semantic Textual Similarity	SICK-R (test)	Similarity Score	60.74	30
Outlier Detection	Sick	AP (%)	36.3	22
Semantic Similarity	SICK	Accuracy	652.1	21
Classification	Sick (test)	Accuracy	98.94	21
Textual Entailment	SICK (test)	Accuracy	90.3	21
Sentence Relatedness	SICK (test + train)	Spearman Correlation	0.61	21
Semantic Textual Similarity	SICK-R	Spearman Rho (x100)	72.56	16
Classification	Sick	F1 Score	88.71	15
Natural Language Entailment	SICK-E	Spearman Rho (x100)	71.26	12
Semantic Textual Similarity	SICK (test)	Spearman Correlation	0.7669	12
Semantic Relatedness	SICK	Pearson r	0.868	12
Outlier Detection	Sick	AUC	0.918	11
Outlier Detection	Sick	AUC-ROC	90.2	11
Outlier Detection	Sick	AUC-PR	0.355	11
Imbalanced Classification	Sick	Macro F1	89.63	8
Sentence Relatedness	SICK (test)	Pearson Correlation (r)	0.8695	7
Semantic Similarity	SICK-R (test)	Semantic Consistency (spring)	67.03	5
Natural Language Inference Explanation Evaluation	SICK (sample)	Average Score	95.63	4
Semantic Textual Similarity	SICK	Pearson Correlation	0.915	4
Semantic Relatedness	SICK (test)	MSE	0.233	4
Regression	SICK-R	Spearman Correlation	86.54	3

Showing 25 of 29 rows