
Multitask Language Understanding on MMLU (Accuracy and Performance Gain)

73.5 Accuracy

Qwen3 8B Base

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy	Performance Gain
Qwen3 8B Base 2026.01		73.5	-2.5
Qwen3 8B Base 2026.01		73.5	-2.5
Llama 3.1 8B 2026.01		63.9	-0.8
Llama 3.1 8B 2026.01		63.6	-1.1
Mistral Nemo Base 2407 2026.01		62.6	-1.3
Mistral Nemo Base 2407 2026.01		62.3	-1.6
Mistral 7B v0.3 2026.01		58.7	-1.3
Mistral 7B v0.3 2026.01		58.5	-1.5
Llama 3.2 1B 2026.01		43.8	1.5
Qwen3 0.6B 2026.01		42.0	-1.0
Qwen3 0.6B 2026.01		41.4	-1.6
Llama 3.2 1B 2026.01		41.1	-1.1