Functional Coding on HumanEval (pass@k)
[Chart: Pass@1 by method; highlighted point: Gumbel, 14.3 Pass@1, May 9, 2026. Selectable metrics: Pass@1, Pass@5, Pass@10.]
Updated 21d ago
Evaluation Results
Method               Date     Pass@1  Pass@5  Pass@10
Gumbel               2026.05  14.3    20.5    25.6
No Watermark         2026.05  14      20.7    28.1
Inverse Transform    2026.05  13.8    22      27.5
SimplexWater         2026.05  13.7    22.7    25.3
HeavyWater           2026.05  13.1    18.9    27.8
Red/Green (delta=3)  2026.05  11.6    19.5    23.2
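
The page does not state how its pass@k figures are computed. As a point of reference only, the sketch below implements the unbiased pass@k estimator commonly used with HumanEval: for a problem with n sampled completions of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The function names `pass_at_k` and `benchmark_pass_at_k` and the input layout are assumptions made for illustration, not part of this leaderboard.

```python
from math import comb
from statistics import mean


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are incorrect, every size-k subset
    # must contain at least one correct sample.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no correct sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Mean pass@k over problems, as a percentage.

    `results` is a hypothetical layout: one (n, c) pair per problem.
    """
    return 100.0 * mean(pass_at_k(n, c, k) for n, c in results)
```

For example, `benchmark_pass_at_k([(10, 1), (10, 0)], k=1)` averages a per-problem pass@1 of 0.1 and 0.0. Reported scores such as those in the table are on this percentage scale, assuming the leaderboard follows the same convention.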