Natural Language Understanding and Reasoning on General Benchmarks Italian
[Chart: ARC-C-it score by method. Highlighted point: Qwen2.5, 37.47 (Dec 25, 2025).]
Updated 4d ago
Evaluation Results
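| Method   | Date    | ARC-C-it | Belebele-it | Hellaswag-it | MMLU (Global) | Evalita-LLM (agg.) | Average Score |
|----------|---------|----------|-------------|--------------|---------------|--------------------|---------------|
| Qwen2.5  | 2025.12 | 37.47    | 70.44       | 47.55        | 46.71         | 43.33              | 49.1          |
| Gamayun  | 2025.12 | 35.9     | 67.22       | 52.63        | 41.87         | 41.28              | 47.78         |
| Qwen3    | 2025.12 | 35.33    | 63.89       | 45.61        | 49.13         | 40.24              | 46.84         |
| Gemma3   | 2025.12 | 34.99    | 42.56       | 47.43        | 37.19         | 36.27              | 39.69         |
| EuroLM   | 2025.12 | 32.08    | 24.56       | 50.75        | 27.4          | 33.96              | 33.75         |
| Llama3.2 | 2025.12 | 29.34    | 48          | 42.34        | 38.64         | 35.65              | 38.79         |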
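
The page does not say how the Average Score is computed. The reported values are consistent with an unweighted mean of the five benchmark scores, rounded to two decimals. The sketch below checks that assumption against the results listed above. The names `scores`, `reported_average`, and `mean_score` are illustrative, not part of the leaderboard.

```python
# Scores copied from the results above, in this column order:
# ARC-C-it, Belebele-it, Hellaswag-it, MMLU (Global), Evalita-LLM (agg.)
scores = {
    "Qwen2.5":  [37.47, 70.44, 47.55, 46.71, 43.33],
    "Gamayun":  [35.9, 67.22, 52.63, 41.87, 41.28],
    "Qwen3":    [35.33, 63.89, 45.61, 49.13, 40.24],
    "Gemma3":   [34.99, 42.56, 47.43, 37.19, 36.27],
    "EuroLM":   [32.08, 24.56, 50.75, 27.4, 33.96],
    "Llama3.2": [29.34, 48, 42.34, 38.64, 35.65],
}

# Average Score column as reported on the page.
reported_average = {
    "Qwen2.5": 49.1,
    "Gamayun": 47.78,
    "Qwen3": 46.84,
    "Gemma3": 39.69,
    "EuroLM": 33.75,
    "Llama3.2": 38.79,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the source)."""
    return sum(values) / len(values)


for method, values in scores.items():
    computed = mean_score(values)
    print(f"{method:<9} computed={computed:.2f} reported={reported_average[method]}")
```

Note that the rows are ordered by ARC-C-it rather than by Average Score, so Llama3.2 sits below EuroLM despite having a higher average.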