Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs

About

Open-weight LLMs have been released by frontier labs; however, sovereign LLMs (for languages other than English) remain low in supply yet high in demand. Training large language models for low-resource languages such as Hebrew poses unique challenges. In this paper, we introduce Dicta-LM 3.0: an open-weight collection of LLMs trained on substantial corpora of Hebrew and English text. The models are released in three sizes: 24B, adapted from the Mistral-Small-3.1 base model; 12B, adapted from the NVIDIA Nemotron Nano V2 model; and 1.7B, adapted from the Qwen3-1.7B base model. We release multiple variants of each model, each with a native context length of 65k tokens: a base model and a chat model with tool-calling support. To rigorously evaluate our models, we introduce a new benchmark suite for Hebrew chat LLMs, covering a diverse set of tasks including Translation, Summarization, Winograd, Israeli Trivia, and Diacritization (nikud). Our work not only addresses the intricacies of training LLMs for low-resource languages but also proposes a framework that can be leveraged to adapt other LLMs to non-English languages, contributing to the broader field of multilingual NLP.

Shaltiel Shmidman, Avi Shmidman, Amir DN Cohen, Moshe Koppel • 2026
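For readers who want to try one of the released chat variants, the sketch below loads a model with the Hugging Face transformers library and runs a single chat turn. The repository ID, prompt, and generation settings are illustrative assumptions, not taken from the paper; substitute the repository IDs the authors actually publish.

    # Minimal sketch (assumptions): loading a Dicta-LM 3.0 chat variant via Hugging Face
    # transformers. The repo ID below is hypothetical -- replace it with the published one.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "dicta-il/dictalm3.0-1.7b-chat"  # hypothetical repository ID

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # One chat turn; the abstract states a native context length of 65k tokens,
    # so long documents can be passed in directly.
    messages = [
        {"role": "user",
         "content": "Translate to Hebrew: The library opens at nine tomorrow morning."},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))

Transformers' apply_chat_template also accepts a tools= argument for tool-calling; whether the Dicta-LM 3.0 chat templates expose their tool-calling support through that path is an assumption here, so check the model cards once released.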

Related benchmarks

Task                     Dataset           Result                 Rank
Mathematical Reasoning   MATH              Accuracy 74.99         643
Instruction Following    IFEval            --                     292
Knowledge                MMLU              Accuracy 85.93         71
Knowledge                GPQA              Accuracy 55.13         34
Mathematics              MATH              MATH Accuracy 86.41    32
Mathematical Reasoning   OMEGA             Score 15.19            28
Chat Evaluation          AlpacaEval LC 2   Score 74.11            23
Question Answering       PopQA             Accuracy 26.31         16
Reasoning                AGI Eval EN       Accuracy 82.93         15
Math                     OMEGA             Accuracy 28.38         13

(Showing 10 of 17 rows)
