GECToR -- Grammatical Error Correction: Tag, Not Rewrite
About
In this paper, we present a simple and efficient GEC sequence tagger using a Transformer encoder. Our system is pre-trained on synthetic data and then fine-tuned in two stages: first on errorful corpora, and second on a combination of errorful and error-free parallel corpora. We design custom token-level transformations to map input tokens to target corrections. Our best single-model/ensemble GEC tagger achieves an $F_{0.5}$ of 65.3/66.5 on CoNLL-2014 (test) and $F_{0.5}$ of 72.4/73.6 on BEA-2019 (test). Its inference speed is up to 10 times that of a Transformer-based seq2seq GEC system. The code and trained models are publicly available.
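The core idea of tagging rather than rewriting can be illustrated with a toy decoder that applies token-level edit tags of the kind the paper describes ($KEEP, $DELETE, $APPEND_t, $REPLACE_t). This is a minimal sketch for illustration, not the authors' implementation; the example sentence and tag sequence are hypothetical.

```python
def apply_tags(tokens, tags):
    """Apply one edit tag per source token and return the corrected tokens.

    Toy version of GECToR-style basic transformations:
    $KEEP leaves the token, $DELETE drops it, $REPLACE_t substitutes t,
    and $APPEND_t inserts t after the current token.
    """
    out = []
    for token, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(token)
        elif tag == "$DELETE":
            continue
        elif tag.startswith("$REPLACE_"):
            out.append(tag[len("$REPLACE_"):])
        elif tag.startswith("$APPEND_"):
            out.append(token)
            out.append(tag[len("$APPEND_"):])
    return out

tokens = ["He", "go", "to", "a", "school", "yesterday"]
tags = ["$KEEP", "$REPLACE_goes", "$KEEP", "$DELETE", "$KEEP", "$APPEND_."]
print(" ".join(apply_tags(tokens, tags)))  # He goes to school yesterday .
```

Because each token receives exactly one tag, the full model applies such corrections iteratively over several passes; the tagger's output can be decoded in a single pass per iteration, which is what makes inference much faster than autoregressive seq2seq rewriting.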
Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi • 2020
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Grammatical Error Correction | CoNLL-2014 (test) | F0.5 | 66.5 | 207 |
| Grammatical Error Correction | BEA shared task 2019 (test) | F0.5 | 73.7 | 139 |
| Grammatical Error Correction | MuCGEC (test) | Precision | 46.72 | 34 |
| Grammatical Error Correction | BEA 2019 (dev) | F0.5 | 55.62 | 19 |
| Grammatical Error Correction | FCGEC (test) | Precision | 46.11 | 17 |
| Grammatical Error Correction | BEA 2019 (test) | F0.5 | 72.4 | 12 |
| Morph Resolution | LiveAMR (Test2) | Accuracy | 70.2 | 9 |
| Morph Resolution | LiveAMR (Test1) | Accuracy | 65.1 | 9 |
| Grammatical Error Correction | FCGEC | Exact Match | 15.66 | 9 |