
Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

About

Multilingual neural machine translation (MNMT) aims to build a unified model for many language directions. Existing monolithic models for MNMT encounter two challenges: parameter interference among languages and inefficient inference for large models. In this paper, we revisit the classic multi-way structures and develop a detachable model by assigning each language (or group of languages) to an individual branch that supports plug-and-play training and inference. To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT. For a fair comparison, we collect data from OPUS and build a translation benchmark covering 433 languages and 1.3B parallel data. Experiments show that Lego-MT with 1.2B parameters brings an average gain of 3.2 spBLEU. It even outperforms M2M-100 with 12B parameters. The proposed training recipe brings a 28.2× speedup over the conventional multi-way training method. Code: https://github.com/CONE-MT/Lego-MT
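
To make the idea of a detachable, multi-way model concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the class `DetachableTranslator`, the helper `make_branch`, and the word-to-concept lexicons are all hypothetical, and a dictionary lookup stands in for a trained encoder or decoder. It only illustrates the structural point from the abstract, that each language owns its own branch, all branches meet in one shared space, and a branch can be attached or removed without touching the others.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a trained module: maps a token list to a token list.
Branch = Callable[[List[str]], List[str]]


def make_branch(lexicon: Dict[str, str]) -> Tuple[Branch, Branch]:
    """Build a toy (encoder, decoder) pair from a word -> shared-concept lexicon."""
    inverse = {concept: word for word, concept in lexicon.items()}

    def encode(tokens: List[str]) -> List[str]:
        # Map language-specific words into the shared concept space.
        return [lexicon.get(tok, "<unk>") for tok in tokens]

    def decode(concepts: List[str]) -> List[str]:
        # Map shared concepts back into this language's words.
        return [inverse.get(c, "<unk>") for c in concepts]

    return encode, decode


@dataclass
class DetachableTranslator:
    """Toy multi-way model: one encoder branch and one decoder branch per language."""
    encoders: Dict[str, Branch] = field(default_factory=dict)
    decoders: Dict[str, Branch] = field(default_factory=dict)

    def attach(self, lang: str, encoder: Branch, decoder: Branch) -> None:
        # Plug in a language branch; other branches are left untouched.
        self.encoders[lang] = encoder
        self.decoders[lang] = decoder

    def detach(self, lang: str) -> None:
        # Remove a language branch, e.g. to shrink the model loaded for inference.
        self.encoders.pop(lang, None)
        self.decoders.pop(lang, None)

    def translate(self, tokens: List[str], src: str, tgt: str) -> List[str]:
        if src not in self.encoders:
            raise KeyError(f"no encoder branch attached for {src!r}")
        if tgt not in self.decoders:
            raise KeyError(f"no decoder branch attached for {tgt!r}")
        shared = self.encoders[src](tokens)   # source branch -> shared space
        return self.decoders[tgt](shared)     # shared space -> target branch


model = DetachableTranslator()
model.attach("en", *make_branch({"hello": "GREETING", "world": "WORLD"}))
model.attach("de", *make_branch({"hallo": "GREETING", "welt": "WORLD"}))
print(model.translate(["hello", "world"], "en", "de"))  # ['hallo', 'welt']
```

In this sketch, serving a single direction only requires the two branches involved, which is the intuition behind loading a subset of a large multi-way model at inference time. How Lego-MT actually trains its branches into a unified space is described in the paper and is not modeled here.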

Fei Yuan, Yinquan Lu, WenHao Zhu, Lingpeng Kong, Lei Li, Yu Qiao, Jingjing Xu • 2022

Related benchmarks

Task | Dataset | Result | Rank
Machine Translation | FLORES-101 (devtest) | -- | 30
Machine Translation | Flores-101 (test) | Average Score: 3.44e+3 | 24
Machine Translation | Unseen directions Ceb to African languages zero-shot M2M-100 baseline comparison | Score (Ceb->Ha): 12.5 | 4
Machine Translation | Unseen directions African languages to Ceb M2M-100 baseline comparison zero-shot | Ha→Ceb Score: 12.3 | 4
Machine Translation | Unseen directions Low-resource to High-resource X-directions zero-shot X = {En, De, Zh, Ar} | Zero-Shot Score (Asturian): 15.5 | 4
Machine Translation | Unseen directions High-resource to Low-resource X-directions zero-shot X = {En, De, Zh, Ar} | Quality (X->Ast): 15.4 | 4
