Language Generation on P3B3
Best General Score: 95.9 (AMALIA-9B-DPO, Mar 27, 2026)
Updated 19d ago
Evaluation Results
| Method | Model Type | Date | General Score |
|---|---|---|---|
| AMALIA-9B-DPO | Fully open... | 2026.03 | 95.9 |
| AMALIA-9B-SFT | Fully open... | 2026.03 | 91.3 |
| Gemma 3-12B | Open weight... | 2026.03 | 88.3 |
| Gemma 2-9B | Open weight... | 2026.03 | 72.1 |
| EuroLLM-9B | Fully open... | 2026.03 | 70.5 |
| Salamandra-7B | Fully open... | 2026.03 | 42.7 |
| Apertus-8B | Fully open... | 2026.03 | 28.1 |
| Llama 3.1-8B | Open weight... | 2026.03 | 27.8 |
| Gervasio-8B | Open weight... | 2026.03 | 24.7 |
| Ministral-8B | Open weight... | 2026.03 | 22.1 |
| Qwen 2.5-7B | Open weight... | 2026.03 | 20 |
| Mistral-7B | Open weight... | 2026.03 | 19.2 |
| Qwen 3-8B | Open weight... | 2026.03 | 18.9 |
| OLMo 2-7B | Fully open... | 2026.03 | 18.6 |