Code Generation on HumanEval v1 (test)
Best result: PrefillShare, 86.6 Accuracy
[Chart: Accuracy over time; date shown: Feb 12, 2026]
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| PrefillShare | Backbone=Qwen3-8B-Base... | 2026.02 | 86.6 |
| Full-FT | Backbone=Qwen3-8B-Base... | 2026.02 | 83.5 |
| Qwen3-8B-Base | KV Sharing=Inherent | 2026.02 | 68.3 |
| PrefillShare | Backbone=LLaMA3.1-8B,... | 2026.02 | 48.8 |
| Full-FT | Backbone=LLaMA3.1-8B,... | 2026.02 | 48.2 |
| LLaMA3.1-8B | KV Sharing=Inherent | 2026.02 | 36.6 |