Code Generation and Functional Correctness on HumanEval
[Chart: Output Throughput, Our approach: 206.79 on Mar 5, 2026]
Updated 2mo ago
Evaluation Results
Method        Details                    Date     Output Throughput  Percentage Difference
Our approach  Vocabulary size=13264,...  2026.03  206.79             2.2
128k-vocab    Vocabulary size=128k,...   2026.03  202.31             -
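The page does not say how Percentage Difference is computed. A minimal sketch, assuming it is the relative gain in Output Throughput over the 128k-vocab row (the function name `percentage_difference` is hypothetical, not from the source), reproduces the 2.2 listed for Our approach:

```python
def percentage_difference(value: float, baseline: float) -> float:
    """Relative change of `value` versus `baseline`, in percent.

    Assumption: the leaderboard measures the difference against the
    128k-vocab row; the source does not state the baseline explicitly.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


# Output Throughput values taken from the Evaluation Results table.
ours = 206.79
baseline_128k = 202.31

# Rounded to one decimal place, matching the precision shown in the table.
print(round(percentage_difference(ours, baseline_128k), 1))
```

Under this assumption the baseline row has no difference from itself, which would account for the "-" in its Percentage Difference column.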