Reasoning on Hebrew Reasoning Benchmarks Suite (Copa, ARC-AI2, HellaSwag, MMLU, GSM8K, Psychometric Psi)
[Chart: Copa (HE) score over time. Highlighted point: Gemma-3-27B-IT, 93.3, May 11, 2026.]
Updated 21d ago
Evaluation Results
| Method | Training Stage | Date | Copa (HE) | ARC-AI2 (HE) | HellaSwag (HE) | MMLU (HE) | GSM8K (HE) | Psychometric Psi (HE) | Hebrew Average Score |
|---|---|---|---|---|---|---|---|---|---|
| Gemma-3-27B-IT | SFT | 2026.05 | 93.3 | 91.4 | 63.6 | 72.5 | 82.8 | 54.3 | 76.3 |
| Hebatron | SFT | 2026.05 | 91.9 | 88 | 58.9 | 68.4 | 83.3 | 52.5 | 73.8 |
| DictaLM-3.0-Thinking | SFT | 2026.05 | 88 | 91.2 | 61.7 | 60.2 | 70.2 | 42.3 | 68.9 |