Coding on HumanEval & MBPP
[Chart: HumanEval Score, MBPP Score, and Average Code Score by method. Highlighted point: Qwen2.5-7B-Instruct, HumanEval Score 81.7, Dec 18, 2025.]
Evaluation Results
| Method | Tag | Date | HumanEval Score | MBPP Score | Average Code Score |
|---|---|---|---|---|---|
| Qwen2.5-7B-Instruct | Base Model Architectur... | 2025.12 | 81.7 | 79.4 | 80.6 |
| Qwen2.5-7B + DataFlow-Instruct-10K | Base Model Architectur... | 2025.12 | 80.5 | 76.7 | 78.6 |
| Qwen2.5-7B-Base | Base Model Architectur... | 2025.12 | 78.7 | 74.3 | 76.5 |
| Qwen2.5-7B + Inf-1M | Base Model Architectur... | 2025.12 | 78 | 78 | 78 |
| Qwen2.5-7B + Inf-10K | Base Model Architectur... | 2025.12 | 77.4 | 77.8 | 77.6 |
| Qwen2-7B-Instruct | Base Model Architectur... | 2025.12 | 73.8 | 65.3 | 69.6 |
| Qwen2-7B-Base | Base Model Architectur... | 2025.12 | 66.5 | 66.1 | 66.3 |
| Qwen2-7B + Inf-1M | Base Model Architectur... | 2025.12 | 65.9 | 70.4 | 68.2 |
| Qwen2-7B + DataFlow-Instruct-10K | Base Model Architectur... | 2025.12 | 64.6 | 67.7 | 66.2 |
| Qwen2-7B + Inf-10K | Base Model Architectur... | 2025.12 | 64 | 71.7 | 67.8 |
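
The page does not say how Average Code Score is derived. The sketch below assumes it is the plain mean of the HumanEval and MBPP scores, and checks that assumption against the Evaluation Results rows above. The names `ROWS`, `mean_code_score`, and `max_deviation` are illustrative and not part of the leaderboard. Because the displayed scores are already rounded to one decimal, the recomputed mean can differ from the reported value by about 0.05.

```python
# Scores copied from the Evaluation Results rows:
# method -> (HumanEval, MBPP, reported Average Code Score).
ROWS = {
    "Qwen2.5-7B-Instruct": (81.7, 79.4, 80.6),
    "Qwen2.5-7B + DataFlow-Instruct-10K": (80.5, 76.7, 78.6),
    "Qwen2.5-7B-Base": (78.7, 74.3, 76.5),
    "Qwen2.5-7B + Inf-1M": (78.0, 78.0, 78.0),
    "Qwen2.5-7B + Inf-10K": (77.4, 77.8, 77.6),
    "Qwen2-7B-Instruct": (73.8, 65.3, 69.6),
    "Qwen2-7B-Base": (66.5, 66.1, 66.3),
    "Qwen2-7B + Inf-1M": (65.9, 70.4, 68.2),
    "Qwen2-7B + DataFlow-Instruct-10K": (64.6, 67.7, 66.2),
    "Qwen2-7B + Inf-10K": (64.0, 71.7, 67.8),
}


def mean_code_score(humaneval: float, mbpp: float) -> float:
    """Assumed definition: unweighted mean of the two benchmark scores."""
    return (humaneval + mbpp) / 2


def max_deviation(rows: dict) -> float:
    """Largest gap between the recomputed mean and the reported average."""
    return max(abs(mean_code_score(h, m) - avg) for h, m, avg in rows.values())


if __name__ == "__main__":
    for name, (h, m, avg) in ROWS.items():
        print(f"{name}: recomputed={mean_code_score(h, m):.2f} reported={avg}")
    print(f"max deviation: {max_deviation(ROWS):.3f}")
```

Every row agrees with the assumed mean to within one-decimal rounding, which is consistent with, but does not prove, an unweighted average computed from unrounded scores.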