Iterative Translation Refinement with Large Language Models
About
We propose iteratively prompting a large language model to self-correct a translation, an approach inspired by LLMs' strong language understanding and translation capabilities as well as by human-like translation workflows. Interestingly, multi-turn querying lowers the output's string-based metric scores, but neural metrics suggest comparable or improved quality. Human evaluations indicate better fluency and naturalness compared to initial translations and even to human references, all while maintaining quality. Ablation studies underscore the importance of anchoring the refinement to the source text and of a reasonable seed translation. We also discuss the challenges in evaluation and the relation to human performance and translationese.
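The refinement loop described above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: `query_llm` is a hypothetical stand-in for a real LLM API call (here a deterministic stub), and the prompt wording is an assumption.

```python
def query_llm(prompt: str) -> str:
    """Hypothetical LLM call. A real implementation would query an
    LLM API here; this stub just echoes the last prompt line so the
    sketch runs end to end."""
    return prompt.splitlines()[-1]


def refine_translation(source: str, seed_translation: str, rounds: int = 3) -> str:
    """Iteratively ask the model to improve a translation, anchoring
    every round to the original source sentence (the ablation in the
    paper suggests this anchoring matters for quality)."""
    translation = seed_translation
    for _ in range(rounds):
        prompt = (
            f"Source: {source}\n"
            f"Current translation: {translation}\n"
            "Refine the translation, keeping it faithful to the source. "
            "Reply with only the refined text.\n"
            # Stub convention: the last line is what query_llm echoes back.
            + translation
        )
        revised = query_llm(prompt).strip()
        # Keep the previous version if the model returns nothing useful.
        translation = revised or translation
    return translation
```

With the echo stub the loop is a no-op; swapping `query_llm` for a real model call yields the multi-turn refinement the abstract describes.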
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Machine Translation | De-En document-level | d-COMET | 87.73 | 36 |
| Machine Translation | WMT 22 De-En (test) | COMET | 86.86 | 29 |
| Machine Translation | WMT 2023 (test) | COMET | 87.6 | 24 |
| Machine Translation | Fr-En | COMET | 0.8763 | 21 |
| Machine Translation | Ru-En document-level | d-COMET | 83.87 | 18 |
| Machine Translation | En-Ru document-level | d-COMET | 85.63 | 18 |
| Machine Translation | En-Fr document-level | d-COMET | 85.06 | 18 |
| Machine Translation | En-Zh document-level | d-COMET | 82.93 | 18 |
| Machine Translation | Es-En document-level | d-COMET | 88.23 | 18 |
| Machine Translation | 10-language machine translation evaluation suite (test) | De->En Score | 89.33 | 18 |