Language Understanding on MMLU 1-order
[Chart: Accuracy over time on MMLU 1-order; current best 55.2 (Raw), as of Mar 17, 2026. Updated 1 month ago.]
Evaluation Results

Method       Model          Date     Accuracy
Raw          Llama-3.1-8B   2026.03  55.2
MOSAIC       Llama-3.1-8B   2026.03  55.1
ORPO         Llama-3.1-8B   2026.03  54.9
SFT          Llama-3.1-8B   2026.03  54.7
In-context   Llama-3.1-8B   2026.03  53.4
Raw          Llama-3.2-3B   2026.03  50.7
MOSAIC       Llama-3.2-3B   2026.03  49.8
SFT          Llama-3.2-3B   2026.03  49.2
In-context   Llama-3.2-3B   2026.03  48.9
ORPO         Llama-3.2-3B   2026.03  48.9