Robustness Evaluation on MMLU

Best VAcc: 88 (DeepSeek-R1-Distill-LLaMA-8B)

Updated 1mo ago

Evaluation Results

Method	Date	VAcc
DeepSeek-R1-Distill-LLaMA-8B	2025.06	88	58	30	34.09
LLaMA-3-8B-Instruct	2025.06	69	47	22	31.88
Gemma-2-2B-IT	2025.06	48	21	27	56.25
LLaMA-3.2-1B-Instruct	2025.06	42	5	37	88.10
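
The four numeric columns in each row appear to be arithmetically related: the third equals the first minus the second, and the fourth equals the third as a percentage of the first. The page does not label the last three columns, so reading them as a second accuracy score, an absolute drop, and a relative drop is an assumption. The function names `derive` and `check_rows` are hypothetical. A minimal Python sketch that checks this relationship against the listed values:

```python
# Rows copied from the results table: (method, col1, col2, col3, col4).
# ASSUMPTION: col1 is the headline accuracy (VAcc), col2 is a second
# accuracy score, col3 is the absolute drop, col4 is the relative drop in %.
# The page itself does not name col2..col4.
ROWS = [
    ("DeepSeek-R1-Distill-LLaMA-8B", 88, 58, 30, 34.09),
    ("LLaMA-3-8B-Instruct", 69, 47, 22, 31.88),
    ("Gemma-2-2B-IT", 48, 21, 27, 56.25),
    ("LLaMA-3.2-1B-Instruct", 42, 5, 37, 88.1),
]


def derive(col1, col2):
    """Return (absolute drop, relative drop in percent) for one row."""
    drop = col1 - col2
    relative = 100.0 * drop / col1
    return drop, relative


def check_rows(rows, tol=0.01):
    """Compare derived values with the listed ones, allowing for rounding."""
    results = []
    for name, col1, col2, col3, col4 in rows:
        drop, relative = derive(col1, col2)
        consistent = (drop == col3) and abs(relative - col4) <= tol
        results.append((name, drop, round(relative, 2), consistent))
    return results


if __name__ == "__main__":
    for name, drop, relative, consistent in check_rows(ROWS):
        print(f"{name}: drop={drop}, relative={relative}%, consistent={consistent}")
```

Under this reading, the model with the highest first-column score also has one of the smallest relative drops, while LLaMA-3.2-1B-Instruct loses most of its score.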