BERT for Coreference Resolution: Baselines and Analysis
About
We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO). However, there is still room for improvement in modeling document-level context, conversations, and mention paraphrasing. Our code and models are publicly available.
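The coreference systems discussed here build on a span-ranking objective: each mention is scored against every earlier mention and either linked to its best antecedent or left to start a new cluster. A minimal sketch of that scoring step is below; random vectors stand in for BERT span embeddings, and the bilinear scorer `W` and the zero-score dummy-antecedent threshold are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                             # span embedding size (illustrative)
spans = rng.normal(size=(4, d))    # 4 candidate mention embeddings
W = rng.normal(size=(d, d))        # stand-in for a learned bilinear scorer

def antecedent(i, spans, W):
    """Return the best antecedent index for mention i, or None (dummy)."""
    if i == 0:
        return None                # first mention has no antecedent
    # Pairwise score s(i, j) = g_j^T W g_i for every earlier mention j.
    scores = spans[:i] @ W @ spans[i]
    best = int(np.argmax(scores))
    # Dummy antecedent scores 0: below that, the mention starts a new cluster.
    return best if scores[best] > 0 else None

links = [antecedent(i, spans, W) for i in range(len(spans))]
print(links)
```

In a real system, the span embeddings would come from a fine-tuned BERT encoder rather than a random generator, and the scorer would include feed-forward mention and pair terms.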
Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer • 2019
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Coreference Resolution | CoNLL English 2012 (test) | MUC F1 Score | 83.5 | 114 |
| Coreference Resolution | GAP (test) | Overall F1 | 85 | 53 |
| Coreference Resolution | English OntoNotes 5.0 (test) | -- | -- | 18 |
| Coreference Resolution | CoNLL 2012 | Average F1 | 76.9 | 17 |
| Coreference Resolution | OntoNotes 5.0 (dev) | CoNLL F1 | 80.1 | 13 |
| Coreference Resolution | STM corpus five-fold cross validation (test) | MUC Precision | 61.6 | 6 |