| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Natural Language Inference | XNLI (test) | Average Accuracy | 90 | 167 |
| Natural Language Inference | XNLI | Accuracy | 87.1 | 111 |
| Zero-Shot Cross-Lingual Transfer | XNLI | Pearson Correlation | 0.9639 | 48 |
| Natural Language Inference | XNLI 1.0 (test) | Accuracy | 84.2 | 38 |
| Natural Language Inference | XNLI (dev) | Accuracy | 82.7 | 24 |
| Text Classification | XNLI (test) | Accuracy (Max) | 81.3 | 20 |
| Natural Language Inference | XNLI Sw (test) | Accuracy | 65.34 | 18 |
| Natural Language Inference | XNLI Sw (dev) | Accuracy | 65.68 | 18 |
| Natural Language Inference | XNLI Ur (test) | Accuracy | 0.6518 | 18 |
| Natural Language Inference | XNLI Ur (dev) | Accuracy | 66.43 | 18 |
| Natural Language Inference | XNLI Hi (test) | Accuracy | 71.65 | 18 |
| Natural Language Inference | XNLI Hi (dev) | Accuracy | 71.55 | 18 |
| Zero-shot performance prediction | XNLI | MAE | 1.53 | 18 |
| Natural Language Inference | XNLI French (test) | Accuracy | 85.7 | 16 |
| Natural Language Inference | XNLI 2.0 | Accuracy | 45.21 | 15 |
| Sentence-pair classification | XNLI 1.1 (test) | Accuracy (EN) | 67.97 | 14 |
| Sentence Pair Classification | XNLI Chinese portion (test) | Accuracy | 81.3 | 9 |
| Sentence Pair Classification | XNLI Chinese portion (dev) | Accuracy | 82.4 | 9 |
| Multilingual Natural Language Inference | XNLI (test) | Accuracy (EN) | 86 | 8 |
| Zero-shot Cross-lingual Transfer | XNLI (test) | Pearson Correlation | 0.9377 | 8 |
| Natural Language Inference | XNLI French | Accuracy | 59.1 | 6 |
| Sentence-pair classification | XNLI | Accuracy | 79.2 | 4 |
| Natural Language Inference | XNLI German (test) | Accuracy | 34.5 | 4 |
| Cross-lingual Sentence Classification | XNLI Language Transfer (test) | ar | 71.7 | 3 |
| Natural Language Inference | XNLI 2.0 78 (test) | Accuracy | - | 0 |