
GECToR -- Grammatical Error Correction: Tag, Not Rewrite

About

In this paper, we present a simple and efficient GEC sequence tagger using a Transformer encoder. Our system is pre-trained on synthetic data and then fine-tuned in two stages: first on errorful corpora, and second on a combination of errorful and error-free parallel corpora. We design custom token-level transformations to map input tokens to target corrections. Our best single-model/ensemble GEC tagger achieves an $F_{0.5}$ of 65.3/66.5 on CoNLL-2014 (test) and $F_{0.5}$ of 72.4/73.6 on BEA-2019 (test). It runs inference up to 10 times as fast as a Transformer-based seq2seq GEC system. The code and trained models are publicly available.
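The token-level transformations are the core idea: rather than generating a corrected sentence from scratch, the tagger predicts one edit tag per source token and the tags are then applied to the input. A minimal sketch of how such tags could be applied, assuming a simplified tag vocabulary in the paper's style ($KEEP, $DELETE, $APPEND_x, $REPLACE_x); the `apply_tags` function is illustrative, not the authors' exact implementation:

```python
# Illustrative sketch: apply per-token edit tags to a source sentence.
# The tag names mirror GECToR's style but this omits the paper's
# g-transformations (case changes, merges, verb-form shifts, etc.).

def apply_tags(tokens, tags):
    """Apply one edit tag per source token; return the corrected tokens."""
    out = []
    for token, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(token)                      # copy token unchanged
        elif tag == "$DELETE":
            continue                               # drop token
        elif tag.startswith("$APPEND_"):
            out.append(token)                      # keep token, then
            out.append(tag[len("$APPEND_"):])      # insert a new word after it
        elif tag.startswith("$REPLACE_"):
            out.append(tag[len("$REPLACE_"):])     # substitute a new word
    return out

tokens = ["She", "go", "to", "school", "yesterday"]
tags = ["$KEEP", "$REPLACE_went", "$KEEP", "$KEEP", "$KEEP"]
print(" ".join(apply_tags(tokens, tags)))  # She went to school yesterday
```

Because each tag is tied to a source position, the full correction can be predicted in parallel for all tokens (with iterative refinement for overlapping edits), which is where the speedup over autoregressive seq2seq decoding comes from.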

Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi • 2020

Related benchmarks

Task                         | Dataset                          | Metric    | Result | Rank
Grammatical Error Correction | CoNLL-2014 (test)                | F0.5      | 66.5   | 207
Grammatical Error Correction | BEA shared task 2019 (test)      | F0.5      | 73.7   | 139
Grammatical Error Correction | MuCGEC (test)                    | Precision | 46.72  | 34
Grammatical Error Correction | BEA 2019 (dev)                   | F0.5      | 55.62  | 19
Grammatical Error Correction | FCGEC (test)                     | Precision | 46.11  | 17
Grammatical Error Correction | BEA 2019 (test)                  | F0.5      | 72.4   | 12
Morph Resolution             | LiveAMR (Test2)                  | Accuracy  | 70.2   | 9
Morph Resolution             | LiveAMR (test1)                  | Accuracy  | 65.1   | 9
Grammatical Error Correction | FCGEC                            | EM        | 15.66  | 9
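Most of the GEC benchmarks above report $F_{0.5}$, which weights precision twice as heavily as recall, reflecting that a GEC system should avoid introducing bad edits even at the cost of missing some errors. A minimal sketch of the general $F_\beta$ formula (the helper name is mine, not from the benchmarks):

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score: beta < 1 weights precision more heavily than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# With beta=0.5, a precise-but-conservative system outscores the reverse:
print(round(f_beta(0.8, 0.4), 3))  # 0.667
print(round(f_beta(0.4, 0.8), 3))  # 0.444
```

Swapping precision and recall changes the score, which is why GEC leaderboards that use $F_{0.5}$ favor systems that make fewer, safer corrections.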
