
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

About

While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task. In this paper, we propose a recipe for tailoring LLMs to multiple tasks present in translation workflows. We perform continued pretraining on a multilingual mixture of monolingual and parallel data, creating TowerBase, followed by finetuning on instructions relevant for translation processes, creating TowerInstruct. Our final model surpasses open alternatives on several tasks relevant to translation workflows and is competitive with general-purpose closed LLMs. To facilitate future research, we release the Tower models, our specialization dataset, an evaluation framework for LLMs focusing on the translation ecosystem, and a collection of model generations, including ours, on our benchmark.

Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G.C. de Souza, André F.T. Martins • 2024

Related benchmarks

Task | Dataset | Result | Rank
Machine Translation | FLORES xx→en (test) | Score (de→en) -29.15 | 38
Translation | FLORES-200 it-en (devtest) | sacreBLEU 35.6008 | 35
Machine Translation | NTREX (en->it) 128 (test) | sacreBLEU 41.7372 | 35
Machine Translation | NTREX it->en 128 (test) | sacreBLEU 45.7063 | 35
Machine Translation | Wikinews-25 it->en | sacreBLEU 43.9073 | 35
Translation | FLORES-200 en-it (devtest) | sacreBLEU 30.4748 | 35
Machine Translation | Wikinews-25 en->it | sacreBLEU 44.7598 | 35
Machine Translation | Tatoeba it->en | sacreBLEU 68.7636 | 33
Machine Translation | Tatoeba en->it | sacreBLEU 52.5356 | 33
Machine Translation | Flores-200 (test) | xCOMET (DE) 97.69 | 22
Showing 10 of 39 rows
