Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution

About

We propose a straightforward vocabulary adaptation scheme to extend the language capacity of multilingual machine translation models, paving the way towards efficient continual learning for multilingual machine translation. Our approach is suitable for large-scale datasets, applies to distant languages with unseen scripts, incurs only minor degradation in translation performance on the original language pairs, and remains competitive even when only monolingual data is available for the new languages.

Xavier Garcia, Noah Constant, Ankur P. Parikh, Orhan Firat • 2021

Related benchmarks

Task	Dataset	Result
Machine Translation	TED data (test)	ChrF++ 62.4	94
Machine Translation	TED data (test)	ChrF++ 62.7	12
Machine Translation	TED (test)	RU-EN Score 27.4	12

