| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| CoNLL English 2012 (test) | | MUC F1 Score | 88 | 114 | 2d ago |
| WSC | UPA | Accuracy | 98.5 | 96 | 3d ago |
| GAP (test) | longdoc | Overall F1 | 89.9 | 53 | 3d ago |
| Winogrande | GPT-3 | Accuracy | 73.2 | 36 | 3d ago |
| Winograd WSC273 (test) | Fine-tuned SOTA | Accuracy | 90.1 | 34 | 3d ago |
| CoreRes | | Accuracy | 94.79 | 33 | 3d ago |
| LitBank 1.0 (test) | longdoc | CoNLL F1 | 81 | 27 | 3d ago |
| OntoNotes | GPT-4 | MUC | 93.7 | 23 | 3d ago |
| English OntoNotes 5.0 (test) | | MUC Precision | 88.6 | 18 | 3d ago |
| CoNLL 2012 | | Average F1 | 83.1 | 17 | 2d ago |
| XWinograd | BYOL-nya | Accuracy | 70.37 | 15 | 2d ago |
| LitBank 1.0 (dev) | U-MEM | CoNLL F1 | 80.5 | 15 | 3d ago |
| CLUEWSC | GLM-5 Base | EM | 84.2 | 14 | 3d ago |
| Winogrande XL | T0-11B | Accuracy | 60.5 | 13 | 3d ago |
| WSC | KiC-Large | Accuracy | 65.4 | 13 | 3d ago |
| OntoNotes 5.0 (dev) | Wu et al. | CoNLL F1 | 83.4 | 13 | 3d ago |
| WikiCoref (WC) (test) | ImCoref-CeS_gpt4 | Average F1 | 73.2 | 12 | 3d ago |
| WSC (test) | PromptAgent | Accuracy | 82.7 | 11 | 3d ago |
| Winogender (test) | GLM-130B | Accuracy | 80.7 | 11 | 3d ago |
| OntoNotes v1 (test) | Wu et al. | CoNLL F1 | 83.1 | 11 | 3d ago |
| OntoNotes v1 (dev) | Wu et al. | CoNLL F1 | 83.4 | 11 | 3d ago |
| Winogender (WG) (test) | CorefBERT | Accuracy | 80.8 | 11 | 3d ago |
| CoNLL Chinese 2012 (test) | Link-Append | Average F1 Score | 74.3 | 11 | 3d ago |
| LitBank (test) | | Avg. F1 | 81.5 | 10 | 3d ago |
| CRAC 2018 (test) | HYBRID | MUC Precision | 77.9 | 9 | 3d ago |