| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Coreference Resolution | WSC | Accuracy | 98.5 | 96 |
| Classification | WSC | Accuracy | 80 | 41 |
| Language | WSC | Score | 86.51 | 30 |
| Pronoun Disambiguation | WSC (test) | Accuracy (Single) | 78.8 | 14 |
| Coreference Resolution | WSC | Accuracy | 65.4 | 13 |
| Commonsense Reasoning | WSC | Accuracy | 80.6 | 12 |
| Coreference Resolution | WSC (test) | Accuracy | 82.7 | 11 |
| Winograd Schema Challenge | WSC | Accuracy | 43.3 | 8 |
| Coreference Resolution | WSC SuperGLUE (test) | Accuracy (Test) | 65.65 | 8 |
| Coreference Resolution | WSC standard (test) | Accuracy | 56.7 | 8 |
| Coreference Resolution | WSC | F1 Score | 58.36 | 7 |
| Coreference Resolution | WSC | Accuracy (0-shot) | 75.7 | 6 |
| Coreference Resolution | WSC (dev) | Accuracy | 85.6 | 6 |
| Coreference Resolution | WSC273 | Accuracy | 82.8 | 5 |