| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| SimLex | Concepts | Spearman Correlation | 0.1844 | 60 | 17d ago |
| Semantic Similarity Cross-lingual XL | VMSST | Pearson Correlation Coefficient | 0.791 | 24 | 1mo ago |
| STS-B (test) | SimCSE-ROBERTa | Semantic Consistency | 80.22 | 18 | 1mo ago |
| BQ syntactically perturbed | | Accuracy | 79.26 | 17 | 16d ago |
| AFQMC syntactically perturbed | SyCo | Accuracy | 94.27 | 17 | 16d ago |
| LCQMC | SyCo | Accuracy | 83.6 | 17 | 16d ago |
| WIKI (test) | Llama3-8B-Instruct | BLEU-4 | 55.52 | 17 | 1mo ago |
| ICEWS18 (test) | Llama3-8B-Instruct | BLEU-4 | 40.39 | 17 | 1mo ago |
| Semantic Similarity Cross-lingual same language XL s. | VMSST | Pearson's r | 0.815 | 12 | 1mo ago |
| Semantic Similarity English-only | VMSST | Pearson's r | 74.6 | 12 | 1mo ago |
| USEB (Universal Sentence Encoder Benchmark) | SGPT-2.7B | AskU AP | 57.5 | 12 | 1mo ago |
| Crisscrossed Captions (CxC) | DE-T2T+I2T | Mean Average | 74.5 | 10 | 1mo ago |
| π-YALLI corpus Nawatl | FastText Skipgram | Kendall's Tau (1x) | 0.459 | 8 | 9d ago |
| SICK-R (test) | rematch | Semantic Consistency (spring) | 67.03 | 5 | 1mo ago |
| Simulated Music Recommendation Conversations | MuseChat | BertScore F1 | 0.9676 | 3 | 1mo ago |
| Abg-CoQA | IntentRL | Similarity | 83 | 2 | 1mo ago |