Coreference Resolution through a seq2seq Transition-Based System

About

Most recent coreference resolution systems use search algorithms over possible spans to identify mentions and resolve coreference. We instead present a coreference resolution system that uses a text-to-text (seq2seq) paradigm to predict mentions and links jointly. We implement the coreference system as a transition system and use multilingual T5 as the underlying language model. We obtain state-of-the-art accuracy on the CoNLL-2012 datasets: 83.3 F1 for English (2.3 points higher than previous work (Dobrovolskii, 2021)) using only CoNLL data for training, 68.5 F1 for Arabic (+4.1 over previous work), and 74.3 F1 for Chinese (+5.3). In addition, we use the SemEval-2010 datasets for experiments in a zero-shot setting, a few-shot setting, and a supervised setting that uses all available training data. We get substantially higher zero-shot F1 scores for 3 out of 4 languages than previous approaches and significantly exceed previous supervised state-of-the-art results for all five tested languages.
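To make the idea of predicting links one step at a time more concrete, here is a minimal sketch of how a sequence of link decisions could be folded into coreference clusters. It is not the paper's transition system: the `(mention, antecedent)` action format, the token-span encoding, and the function name `apply_link_actions` are all assumptions chosen for illustration, and no language model is involved.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a mention is an inclusive (start_token, end_token) span.
Span = Tuple[int, int]
# Hypothetical action: link `mention` to an earlier `antecedent` mention.
LinkAction = Tuple[Span, Span]


def apply_link_actions(actions: List[LinkAction]) -> List[List[Span]]:
    """Fold a sequence of link actions into coreference clusters.

    Illustrative only; the action inventory used in the paper may differ.
    """
    cluster_of: Dict[Span, int] = {}   # mention -> index of its cluster
    clusters: List[List[Span]] = []    # clusters in order of creation

    for mention, antecedent in actions:
        # An antecedent seen for the first time starts a new cluster.
        if antecedent not in cluster_of:
            cluster_of[antecedent] = len(clusters)
            clusters.append([antecedent])
        # For simplicity, a mention that is already placed is left where it is.
        if mention in cluster_of:
            continue
        idx = cluster_of[antecedent]
        cluster_of[mention] = idx
        clusters[idx].append(mention)

    return clusters


# Tokens: Mary(0) said(1) she(2) would(3) visit(4) her(5) brother(6) .(7) He(8) was(9) happy(10) .(11)
example_actions = [
    ((2, 2), (0, 0)),  # "she" -> "Mary"
    ((5, 5), (2, 2)),  # "her" -> "she"
    ((8, 8), (5, 6)),  # "He"  -> "her brother"
]
example_clusters = apply_link_actions(example_actions)
```

In this toy setting the mentions and the links arrive together in each action, which mirrors the abstract's point that mentions and links are predicted jointly rather than by a separate search over candidate spans.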

Bernd Bohnet, Chris Alberti, Michael Collins • 2022

Related benchmarks

Task	Dataset	Metric	Result
Coreference Resolution	CoNLL English 2012 (test)	MUC F1 Score	87.8	114
Coreference Resolution	OntoNotes	MUC	87.8	46
Coreference Resolution	English OntoNotes 5.0 (test)	MUC Precision	87.4	18
Coreference Resolution	CoNLL Chinese 2012 (test)	Average F1 Score	74.3	11
Coreference Resolution	SemEval Spanish 2010 (test)	Avg F1	83.9	8
Coreference Resolution	SemEval Catalan 2010 (test)	Avg F1 Score	83.5	7
Coreference Resolution	SemEval Dutch 2010 (test)	Average F1	66.6	7
Coreference Resolution	SemEval German 2010 (test)	Avg F1	86.4	7
Coreference Resolution	SemEval Italian 2010 (test)	Avg F1	65.9	6
Coreference Resolution	CoNLL Arabic 2012 (test)	MUC Precision	71	3

