Language Understanding on MMLU 3-order
[Chart: Accuracy over time; current top score 55.2 Accuracy (Raw), as of Mar 17, 2026. Updated 1 month ago.]
Evaluation Results

| Method | Model | Date | Accuracy |
| --- | --- | --- | --- |
| Raw | Llama-3.1-8B | 2026.03 | 55.2 |
| MOSAIC | Llama-3.1-8B | 2026.03 | 55.1 |
| ORPO | Llama-3.1-8B | 2026.03 | 54.9 |
| SFT | Llama-3.1-8B | 2026.03 | 54.7 |
| In-context | Llama-3.1-8B | 2026.03 | 52.7 |
| Raw | Llama-3.2-3B | 2026.03 | 50.7 |
| MOSAIC | Llama-3.2-3B | 2026.03 | 49.4 |
| ORPO | Llama-3.2-3B | 2026.03 | 49.1 |
| SFT | Llama-3.2-3B | 2026.03 | 49.0 |
| In-context | Llama-3.2-3B | 2026.03 | 47.2 |
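The evaluation results above can be held as plain data for quick per-model comparison. A minimal sketch, with the accuracy figures taken from the table; the variable and function names are illustrative, not part of any benchmark API:

```python
# MMLU accuracy by model and method, copied from the results table above.
RESULTS = {
    "Llama-3.1-8B": {"Raw": 55.2, "MOSAIC": 55.1, "ORPO": 54.9,
                     "SFT": 54.7, "In-context": 52.7},
    "Llama-3.2-3B": {"Raw": 50.7, "MOSAIC": 49.4, "ORPO": 49.1,
                     "SFT": 49.0, "In-context": 47.2},
}

def rank_methods(model: str) -> list[tuple[str, float]]:
    """Return (method, accuracy) pairs for a model, highest accuracy first."""
    return sorted(RESULTS[model].items(), key=lambda kv: kv[1], reverse=True)

for model in RESULTS:
    best_method, best_acc = rank_methods(model)[0]
    print(f"{model}: best = {best_method} ({best_acc})")
```

For both model sizes the ranking is the same: Raw scores highest, followed by MOSAIC, ORPO, SFT, and In-context.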