General Tasks on MMLU-redux
Best reported accuracy: 86.8 (Qwen2.5-72B)
[Chart: Accuracy over time, Feb 19, 2025 to May 17, 2026]
Updated 14d ago
Evaluation Results
| Method | Details | Date (YYYY.MM) | Accuracy |
|---|---|---|---|
| Qwen2.5-72B | Model Type=Instruct, P... | 2025.02 | 86.8 |
| Llama-3.1-405B | Model Type=Instruct, P... | 2025.02 | 86.2 |
| Qwen2.5-VL-72B | Model Type=Instruct, P... | 2025.02 | 85.9 |
| Llama-3.1-70B | Model Type=Instruct, P... | 2025.02 | 83 |
| Qwen2-72B | Model Type=Instruct, P... | 2025.02 | 81.6 |
| Fine-tuned | Base LLM=Qwen-2.5-7B-I... | 2026.05 | 76.6 |
| DiDi-Merg.-L | Base LLM=Qwen-2.5-7B-I... | 2026.05 | 76.2 |
| FREE-Merging | Base LLM=Qwen-2.5-7B-I... | 2026.05 | 75.2 |
| Zero-shot | Base LLM=Qwen-2.5-7B-I... | 2026.05 | 75.1 |
| Twin-Merging | Base LLM=Qwen-2.5-7B-I... | 2026.05 | 74.8 |
| Fine-tuned | Base LLM=Llama-3.1-8B-... | 2026.05 | 73 |
| DiDi-Merg.-L | Base LLM=Llama-3.1-8B-... | 2026.05 | 72.8 |
| Twin-Merging | Base LLM=Llama-3.1-8B-... | 2026.05 | 72.6 |
| FREE-Merging | Base LLM=Llama-3.1-8B-... | 2026.05 | 72.4 |
| Zero-shot | Base LLM=Llama-3.1-8B-... | 2026.05 | 70.5 |