Code Generation on HumanEval (pass@1, pass@5)
[Chart: Pass@1 and Pass@5 on HumanEval over time; highlighted point: CAA, Pass@1 53.05, Oct 4, 2025]
Updated 3d ago
Evaluation Results
Method      Setting                    Date     Pass@1  Pass@5
CAA         Backbone=Llama-3.2-3B-...  2025.10  53.05   59.21
RS          Backbone=Llama-3.2-3B-...  2025.10  52.13   55.78
STaR        Backbone=Llama-3.2-3B-...  2025.10  52.13   57.35
FA          Backbone=Llama-3.2-3B-...  2025.10  48.17   55.95
ToT         Backbone=Llama-3.2-3B-...  2025.10  35.73   49.51
Base Model  Backbone=Llama-3.2-3B-...  2025.10  27.44   39.02
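
For readers unfamiliar with the Pass@1 and Pass@5 columns, the sketch below shows one common way pass@k is computed: the unbiased estimator introduced with HumanEval, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the unit tests. This page does not state how its scores were computed, so treat this as an assumption; the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical, as is the choice to report the mean as a percentage.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Uses the unbiased estimator 1 - C(n - c, k) / C(n, k).
    Assumption: this is the estimator behind the scores; the page does not say.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average per-problem pass@k over (n, c) pairs, reported as a percentage."""
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n, the fraction of samples that pass, so Pass@1 can be read as the average per-sample success rate across problems.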