EuroLLM: Multilingual Language Models for Europe
About
The quality of open-weight LLMs has improved significantly, yet they remain predominantly focused on English. In this paper, we introduce the EuroLLM project, aimed at developing a suite of open-weight multilingual LLMs capable of understanding and generating text in all official European Union languages, as well as several additional relevant languages. We outline the progress made to date, detailing our data collection and filtering process, the development of scaling laws, the creation of our multilingual tokenizer, and the data mix and modeling configurations. Additionally, we release our initial models, EuroLLM-1.7B and EuroLLM-1.7B-Instruct, and report their performance on multilingual general benchmarks and machine translation.
Pedro Henrique Martins, Patrick Fernandes, João Alves, Nuno M. Guerreiro, Ricardo Rei, Duarte M. Alves, José Pombal, Amin Farajian, Manuel Faysse, Mateusz Klimaszewski, Pierre Colombo, Barry Haddow, José G. C. de Souza, Alexandra Birch, André F. T. Martins • 2024
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-task Language Understanding | MMLU | Accuracy | 28.3 | 876 |
| Instruction Following | IFEval | -- | -- | 625 |
| Commonsense Reasoning | WinoGrande | Accuracy | 57.8 | 372 |
| Science Question Answering | ARC Challenge | Accuracy | 35.9 | 342 |
| Common Sense Reasoning | HellaSwag | Accuracy | 45.9 | 213 |
| Question Answering | ARC-C | Accuracy | 31.57 | 192 |
| Science Question Answering | ARC Easy | Accuracy | 71.3 | 155 |
| Multitask Language Understanding | MMLU-Pro | Accuracy | 10.9 | 118 |
| Commonsense Reasoning | SocialIQA | Accuracy | 44.8 | 116 |
| Emotional Intelligence | Polish EQ-Bench | Overall Score | 54.1 | 106 |
Showing 10 of 91 rows