| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Semantic Textual Similarity | STS-B | Spearman's Rho (x100): 92.9 | 70 |
| Semantic Similarity | STS-B (test) | Semantic Consistency: 80.22 | 18 |
| Imbalanced Regression | STS-B-DIR Few-shot (test) | MSE: 0.781 | 14 |
| Imbalanced Regression | STS-B-DIR Medium-shot (test) | MSE: 0.899 | 14 |
| Imbalanced Regression | STS-B-DIR Many-shot (test) | MSE: 0.795 | 14 |
| Imbalanced Regression | STS-B-DIR All (test) | MSE: 0.892 | 14 |
| Regression | STS-B-DIR Few-shot | MSE: 0.781 | 14 |
| Regression | STS-B-DIR Medium-shot | MSE: 0.899 | 14 |
| Regression | STS-B-DIR Many-shot | MSE: 0.795 | 14 |
| Regression | STS-B DIR (All) | MSE: 0.892 | 14 |
| Semantic Textual Similarity | STS-B | Accuracy: 0.595 | 10 |
| Semantic Textual Similarity | Multilingual STS-B (val) | Spearman Correlation: 77.48 | 8 |
| Semantic Textual Similarity | STS-B (dev) | Pearson Correlation: 0.918 | 6 |
| Text Similarity Regression | STS-B DIR (test) | MSE (All): 0.877 | 6 |
| Uncertainty Estimation | STS-B DIR Few | NLL: 2.152 | 5 |
| Uncertainty Estimation | STS-B-DIR Medium | NLL: 2.754 | 5 |
| Uncertainty Estimation | STS-B-DIR Many | NLL: 1.81 | 5 |
| Uncertainty Estimation | STS-B DIR (All) | NLL: 1.996 | 5 |
| Regression | STS-B (test) | Spearman Corr (%): 88.94 | 5 |
| Intrinsic Bias Evaluation | STS-B | StereoSet Score: 54.53 | 3 |
| Sentence Ranking | STS-B | KCC: 63.64 | 3 |
| Sentence Retrieval | STS-B (test) | Recall@1: 78.87 | 2 |
| Rank correlation between domain similarity and representation similarity | Multi STS-B May, 2021 (test) | Spearman's rho: 0.924 | 2 |