General LLM Evaluation on Instruction-Following, Mathematics, and Commonsense Reasoning Combined

Average Score: 57

Qwen2.5 7B-PC

Updated 4mo ago

Evaluation Results

Method	Links	Average Score
Qwen2.5 7B-PC 2025.05		57
Granite 3.1 8B-PC 2025.05		56
Olmo 3 7B-PC 2025.05		55
Llama 3.1 8B-PC 2025.05		54
Qwen2.5 7B-NL 2025.05		53
Olmo 3 7B-NL 2025.05		50
Mistral 7B v0.3-PC 2025.05		48
Qwen2.5 7B-CP 2025.05		47
Llama 3.1 8B-NL 2025.05		45
Granite 8B Code-PC 2025.05		43
Olmo 3 7B-CP 2025.05		39
Granite 3.1 8B-CP 2025.05		37
Granite 8B Code-NL 2025.05		36
Mistral 7B v0.3-NL 2025.05		35
Llama 3.1 8B-CP 2025.05		34
Mistral 7B v0.3-CP 2025.05		33
Granite 8B Code-CP 2025.05		29
Granite 3.1 8B-NL 2025.05		26