Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs
About
Fine-tuning pretrained LLMs has been shown to be an effective strategy for reaching state-of-the-art performance on specific tasks like machine translation. However, this process of adaptation often implies sacrificing general-purpose capabilities, such as conversational reasoning and instruction-following, hampering the utility of the system in real-world applications that require a mixture of skills. In this paper, we introduce Tower+, a suite of models designed to deliver strong performance across both translation and multilingual general-purpose text capabilities. We achieve a Pareto frontier between translation specialization and multilingual general-purpose capabilities by introducing a novel training recipe that builds on Tower (Alves et al., 2024), comprising continued pretraining, supervised fine-tuning, preference optimization, and reinforcement learning with verifiable rewards. At each stage of training, we carefully generate and curate data to strengthen performance on translation as well as general-purpose tasks involving code generation, mathematics problem solving, and general instruction-following. We develop models at multiple scales: 2B, 9B, and 72B. Our smaller models often outperform larger general-purpose open-weight and proprietary LLMs (e.g., Llama 3.3 70B, GPT-4o). Our largest model delivers best-in-class translation performance for high-resource languages and top results in multilingual Arena Hard evaluations and in IF-MT, a benchmark we introduce for evaluating both translation and instruction-following. Our findings highlight that it is possible to rival frontier models in general capabilities, while optimizing for specific business domains, such as translation and localization.
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Machine Translation | FLORES+ (test) | spBLEU | 45.32 | 128 |
| Machine Translation | WMT24++ v1.0 (test) | XCOMET | 88.19 | 49 |
| Machine Translation (XX → ZH) | FLORES+ latest (test) | spBLEU | 33.25 | 30 |
| Machine Translation | WMT 2025 (test) | XCOMET-XXL | 41 | 17 |
| Machine Translation | FLORES-200 EN ⇔ XX 2022 | XCOMET-XXL | 84.16 | 17 |
| Machine Translation | FLORES-200 ZH ⇔ XX 2022 | XCOMET-XXL | 0.7969 | 17 |
| Machine Translation | FLORES-200 XX ⇔ XX 2022 | XCOMET-XXL | 70.02 | 17 |
| Machine Translation | Mandarin ⇔ Minority (test) | XCOMET-XXL | 0.3855 | 16 |
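Several rows above report spBLEU, which is standard corpus-level BLEU computed over SentencePiece-tokenized text (the FLORES-200 tokenizer) rather than language-specific word tokens. As a rough illustration of the underlying metric only, here is a minimal, self-contained corpus BLEU sketch over whitespace tokens; the SentencePiece tokenization step that makes it "spBLEU" is deliberately omitted, and in practice one would use an established implementation such as sacreBLEU rather than this sketch.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100) with uniform n-gram weights and brevity penalty.

    Simplification: one reference per hypothesis, whitespace tokenization.
    """
    matches = [0] * max_n   # clipped n-gram matches, accumulated over the corpus
    totals = [0] * max_n    # hypothesis n-gram counts, accumulated over the corpus
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            hc, rc = ngram_counts(h, n), ngram_counts(r, n)
            # clip each hypothesis n-gram count by its count in the reference
            matches[n - 1] += sum(min(c, rc[g]) for g, c in hc.items())
            totals[n - 1] += max(len(h) - n + 1, 0)
    precisions = [m / t if t else 0.0 for m, t in zip(matches, totals)]
    if min(precisions) == 0.0:
        return 0.0  # any zero precision drives the geometric mean to zero
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # brevity penalty: punish hypotheses shorter than the references
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / max(hyp_len, 1))
    return 100.0 * bp * math.exp(log_avg)
```

A perfect match scores 100; scores in the table (e.g. spBLEU 45.32 on FLORES+) sit on this same 0-100 scale, though computed with SentencePiece tokens and sacreBLEU's exact smoothing and signature.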