Code Generation on HumanEval (Score %)
[Chart: Score (%) by date; Qwen3-4B at 57.3 on May 29, 2026]
Evaluation Results
Method       Parameters   Date      Score (%)
Qwen3-4B     4B           2026.05   57.3
Qwen2.5-7B   7B           2026.05   55.5
Qwen3.5-4B   4B           2026.05   50
OLMo-3-7B    7B           2026.05   45.1
Mellum 2     2.5B/12B     2026.05   41.5