Robustness Evaluation on CommonsenseQA

79.07 VAcc

DeepSeek-R1-Distill-LLaMA-8B

Updated 1mo ago

Evaluation Results

Method	Links	VAcc	(unlabeled)	(unlabeled)	(unlabeled)
DeepSeek-R1-Distill-LLaMA-8B 2025.06		79.07	26.62	52.45	66.33
LLaMA-3-8B-Instruct 2025.06		75.84	51.32	24.52	32.33
Gemma-2-2B-IT 2025.06		58.31	34.6	23.71	40.67
LLaMA-3.2-1B-Instruct 2025.06		51.92	16.96	34.96	67.33