Edit-level Majority Voting Mitigates Over-Correction in LLM-based Grammatical Error Correction

About

Grammatical error correction using large language models often suffers from over-correction. To mitigate this, we propose a training-free inference method that performs edit-level majority voting over multiple candidates generated by a single model, without requiring model modifications or additional training. Across nine benchmarks covering English, Czech, German, Ukrainian, Korean, Hindi, and Romanian, the proposed method outperforms both greedy and MBR decoding in most cases. Moreover, it yields stable correction quality regardless of the instruction prompt used. We release two repositories that support GEC dataset loading and LLM inference.

Takumi Goto, Yusuke Sakai, Taro Watanabe • 2026
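
To make the idea of edit-level majority voting concrete, here is a minimal sketch. It is not the authors' implementation. It assumes whitespace tokenization, uses Python's difflib to extract edits (a stand-in for whatever edit extractor the paper actually uses), and keeps an edit only when strictly more than half of the candidates contain it. The function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_edits(source_tokens, candidate_tokens):
    """Return the set of (start, end, replacement) edits that turn the
    source tokens into the candidate tokens. Spans index the source."""
    matcher = SequenceMatcher(a=source_tokens, b=candidate_tokens, autojunk=False)
    edits = set()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.add((i1, i2, tuple(candidate_tokens[j1:j2])))
    return edits


def majority_vote(source, candidates, threshold=0.5):
    """Apply only the edits found in more than `threshold` of the candidates.

    Assumption: with a strict majority (threshold >= 0.5), any two kept edits
    co-occur in at least one candidate, so their spans cannot conflict.
    A lower threshold would need explicit conflict resolution.
    """
    src = source.split()
    votes = Counter()
    for cand in candidates:
        # Each candidate votes at most once per edit (edits are a set).
        votes.update(extract_edits(src, cand.split()))
    kept = [edit for edit, count in votes.items() if count > threshold * len(candidates)]
    out = list(src)
    # Apply right to left so earlier source indices stay valid.
    for start, end, replacement in sorted(kept, reverse=True):
        out[start:end] = replacement
    return " ".join(out)


source = "She go to school yesterday ."
candidates = [
    "She went to school yesterday .",
    "She went to the school yesterday .",
    "She had gone to school yesterday .",
]
print(majority_vote(source, candidates))
# prints: She went to school yesterday .
```

In this toy example, only the edit "go" → "went" appears in two of the three candidates, so it is the only one applied. The inserted "the" and the rewrite to "had gone" each receive a single vote and are dropped, which is how voting at the edit level can suppress over-correction.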

Related benchmarks

Task	Dataset	Result
Grammatical Error Correction	JFLEG (test)	GLEU 62.9	60
Grammatical Error Correction	BEA 2019 (test)	F0.5 74.6	47
Grammatical Error Correction	CWEB-G (test)	Precision 44.7	15
Grammatical Error Correction	AKCES-GEC	Precision 78.4	9
Grammatical Error Correction	Falko-Merlin	Precision 64.5	9
Grammatical Error Correction	UNLP 2023	Precision 50.4	9
Grammatical Error Correction	Kor-learner	Precision 50.9	9
Grammatical Error Correction	Hi-GEC	GLEU 71.4	9
Grammatical Error Correction	RONACC	GLEU 87.1	9

