Small Languages, Big Models: A Study of Continual Training on Languages of Norway

About

Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian and even more so for truly low-resource languages like Northern Sámi. To address this issue, we present a novel three-stage continual training approach that substantially improves the downstream performance together with the inference efficiency for the target languages. Based on our findings, we train, evaluate, and openly release a new generative language model for Norwegian Bokmål, Nynorsk, and Northern Sámi with 11.4 billion parameters: NorMistral-11B.

David Samuel, Vladislav Mikhailov, Erik Velldal, Lilja Øvrelid, Lucas Georges Gabriel Charpentier, Andrey Kutuzov, Stephan Oepen • 2024

Related benchmarks

Task | Dataset | Result | Rank
Commonsense Reasoning | WinoGrande | Accuracy 72 | 1085
Commonsense Reasoning | WinoGrande | Accuracy 49.8 | 372
Truthfulness Evaluation | TruthfulQA | Accuracy 19.7 | 103
Common Sense Reasoning | PIQA | Accuracy 0.00e+0 | 71
General Knowledge | MMLU-Redux | Accuracy 34.2 | 30
Chatbot Evaluation | AI Barometer Estonian Chatbot Arena 19.02.2026 | Score 1.24e+3 | 20
Question Answering | Belebele English | Accuracy 45 | 18
Instruction Following | IFEval EN | Score 43.7 | 12
Academic Question Answering | National Exam Estonian | Accuracy 36.5 | 10
Commonsense Reasoning | Winogrande Estonian | Accuracy 50.4 | 10
Showing 10 of 19 rows
