| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Pronoun Disambiguation | Winograd Schema Challenge | Accuracy 90.1 | 27 |
| Commonsense Reasoning | Winograd Schema Challenge (WSC) (test) | Accuracy 75.1 | 17 |
| Commonsense Reasoning | Hebrew Winograd Schema Challenge | Accuracy 83.45 | 11 |
| Common Sense Reasoning | Winograd Schema Challenge 273 sentences (original) | Accuracy 61.5 | 8 |