
Robustness Evaluation on MMLU

VAcc: 88

DeepSeek-R1-Distill-LLaMA-8B

Jun 5, 2025
Updated 1mo ago

Evaluation Results

Method    Links
2025.06   88   58   30   34.09
2025.06   69   47   22   31.88
2025.06   48   21   27   56.25
2025.06   42    5   37   88.1