# Dicta-LM 3.0: Advancing the Frontier of Hebrew Sovereign LLMs

## About
Open-weight LLMs have been released by frontier labs, yet sovereign large language models (LLMs), those built for languages other than English, remain in short supply and high demand. Training LLMs for low-resource languages such as Hebrew poses unique challenges. In this paper, we introduce Dicta-LM 3.0: an open-weight collection of LLMs trained on substantial corpora of Hebrew and English text. The model is released in three sizes: 24B, adapted from the Mistral-Small-3.1 base model; 12B, adapted from the NVIDIA Nemotron Nano V2 model; and 1.7B, adapted from the Qwen3-1.7B base model. We release multiple variants of each model, each with a native context length of 65k tokens: a base model and a chat model with tool-calling support. To rigorously evaluate our models, we introduce a new benchmark suite for Hebrew chat-LLMs, covering a diverse set of tasks including translation, summarization, Winograd schemas, Israeli trivia, and diacritization (nikud). Our work not only addresses the intricacies of training LLMs for low-resource languages but also proposes a framework for adapting other LLMs to non-English languages, contributing to the broader field of multilingual NLP.
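As a hedged illustration of the tool-calling chat variant, the sketch below builds a chat request in the OpenAI-style JSON tool schema that many open-weight chat models accept. The tool name, the system prompt, and the exact schema Dicta-LM 3.0 expects are assumptions for illustration, not confirmed by this card; only the request construction is shown, since running the model itself requires downloading the weights.

```python
import json

# Hypothetical tool definition in the OpenAI-style JSON schema widely used by
# chat LLMs with tool-calling support (the exact schema expected by
# Dicta-LM 3.0 is an assumption here).
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_request(user_text: str, tools: list) -> dict:
    """Assemble a chat request payload for a tool-calling chat model."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "tools": tools,
    }

# Hebrew user query: "What is the weather in Tel Aviv?"
request = build_request("מה מזג האוויר בתל אביב?", [get_weather_tool])
print(json.dumps(request, ensure_ascii=False, indent=2))
```

A request like this would typically be rendered into the model's prompt format via a chat template (e.g. `tokenizer.apply_chat_template(..., tools=...)` in Hugging Face `transformers`), after which the model may emit a structured call to `get_weather` instead of free-form text.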
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | MATH | Accuracy | 74.99 | 643 |
| Instruction Following | IFEval | -- | -- | 292 |
| Knowledge | MMLU | Accuracy | 85.93 | 71 |
| Knowledge | GPQA | Accuracy | 55.13 | 34 |
| Mathematics | MATH | Accuracy | 86.41 | 32 |
| Mathematical Reasoning | OMEGA | Score | 15.19 | 28 |
| Chat Evaluation | AlpacaEval LC 2 | Score | 74.11 | 23 |
| Question Answering | PopQA | Accuracy | 26.31 | 16 |
| Reasoning | AGI Eval EN | Accuracy | 82.93 | 15 |
| Math | OMEGA | Accuracy | 28.38 | 13 |