Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages
About
We present effective pre-training strategies for neural machine translation (NMT) using parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a significant improvement in source-target translation. We propose three methods to strengthen the connection among the source, pivot, and target languages during pre-training: 1) step-wise training of a single model for different language pairs, 2) an additional adapter component to smoothly connect the pre-trained encoder and decoder, and 3) cross-lingual encoder training via autoencoding of the pivot language. Our methods outperform multilingual models by up to +2.6% BLEU on the WMT 2019 French-German and German-Czech tasks. We show that the improvements also hold in zero-shot/zero-resource scenarios.
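The first method, step-wise training of a single model across language pairs, can be pictured as a short schedule of training stages. The sketch below is illustrative only and is not the authors' code: the `Stage` class, the `stepwise_schedule` function, the stage names, and the choice of English as the pivot are all hypothetical. In particular, freezing the encoder in the second stage is an assumption about how the encoder could be kept tied to the source language; the abstract does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical helper, not from the paper)."""
    name: str
    corpus: tuple         # (input language, output language)
    trainable: frozenset  # model components updated in this stage


def stepwise_schedule(src, pivot, tgt, freeze_encoder_in_step2=True):
    """Build a three-stage schedule: source-pivot, pivot-target, source-target.

    Assumption: the encoder may be frozen in the second stage so that it
    stays close to what it learned on source-side input.
    """
    both = frozenset({"encoder", "decoder"})
    step2 = frozenset({"decoder"}) if freeze_encoder_in_step2 else both
    return [
        Stage("pretrain-1", (src, pivot), both),   # source -> pivot corpus
        Stage("pretrain-2", (pivot, tgt), step2),  # pivot -> target corpus
        Stage("finetune", (src, tgt), both),       # source -> target corpus
    ]


# Example: French -> German with English assumed as the pivot.
schedule = stepwise_schedule("fr", "en", "de")
for stage in schedule:
    print(stage.name, stage.corpus, sorted(stage.trainable))
```

A real training loop would consume each stage in order, loading the named corpus and updating only the listed components of the same model.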
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Date Understanding | Date Understanding FLORES-200 10-languages | Performance (kaz_Cyrl) | 49.6 | 14 |
| Date Understanding | FLORES-200 10 low-resourced languages | Performance Score (kaz_Cyrl) | 20.8 | 7 |
| Math Word Problem Solving | SVAMP 10 low-resourced languages FLORES-200 (test) | Kazakh (Cyrillic) Accuracy | 35 | 7 |
| Mathematical Reasoning | GSM8K FLORES-200 (10 low-resourced languages) (test) | Kazakh (Cyrl) Accuracy | 23.65 | 7 |
| Mathematical Reasoning | SVAMP 10 low-resourced languages FLORES-200 | Kazakh (Cyrl) Score | 4 | 7 |
| Math Reasoning | SVAMP | Kazakh (Cyrl) Accuracy | 54.67 | 7 |
| Sports Understanding | Sports Understanding 10 low-resourced languages FLORES-200 | Kazakh (Cyrl) Score | 48.4 | 7 |