Small Languages, Big Models: A Study of Continual Training on Languages of Norway
About
Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian, and even more so for truly low-resource languages like Northern Sámi. To address this issue, we present a novel three-stage continual training approach that substantially improves both downstream performance and inference efficiency for the target languages. Based on our findings, we train, evaluate, and openly release a new generative language model for Norwegian Bokmål, Nynorsk, and Northern Sámi with 11.4 billion parameters: NorMistral-11B.
David Samuel, Vladislav Mikhailov, Erik Velldal, Lilja Øvrelid, Lucas Georges Gabriel Charpentier, Andrey Kutuzov, Stephan Oepen • 2024
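Since the abstract states that NorMistral-11B is openly released, a minimal sketch for loading and prompting it with the Hugging Face `transformers` library is shown below. The repo id `norallm/normistral-11b-warm` is an assumption for illustration; the actual release name may differ.

```python
# Minimal generation sketch for the released model, assuming it is published
# on the Hugging Face Hub. The repo id below is an assumption and may not
# match the actual release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "norallm/normistral-11b-warm"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the 11.4B model fits on one large GPU in bf16
    device_map="auto",           # requires the `accelerate` package
)

# Prompt in Norwegian Nynorsk, one of the three target languages.
prompt = "Oslo er hovudstaden i"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```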
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Commonsense Reasoning | WinoGrande | Accuracy | 72 | 1085 |
| Commonsense Reasoning | WinoGrande | Accuracy | 49.8 | 372 |
| Truthfulness Evaluation | TruthfulQA | Accuracy | 19.7 | 103 |
| Commonsense Reasoning | PIQA | Accuracy | 0.0 | 71 |
| General Knowledge | MMLU-Redux | Accuracy | 34.2 | 30 |
| Chatbot Evaluation | AI Barometer Estonian Chatbot Arena 19.02.2026 | Score | 1240 | 20 |
| Question Answering | Belebele English | Accuracy | 45 | 18 |
| Instruction Following | IFEval EN | Score | 43.7 | 12 |
| Academic Question Answering | National Exam Estonian | Accuracy | 36.5 | 10 |
| Commonsense Reasoning | Winogrande Estonian | Accuracy | 50.4 | 10 |
*Showing 10 of 19 rows.*