
Code Generation on HumanEval 0-shot (test)

45.7 Accuracy

Prefix-Cache

[Chart: results over time; y-axis ticks 32.4088, 35.8594, 39.31, 42.7606; x-axis ticks Dec 8, 2025, Jan 5, 2026, Feb 3, 2026, Mar 4, 2026, Apr 1, 2026, Apr 30, 2026, May 29, 2026]
Updated 2d ago

Evaluation Results

Method | Links
2026.01
45.7--21.41.8-----
2025.12
45.12----94.411---
2026.01
45.1--32.91.2-----
2026.01
44.5--94.4-----
2026.01
44.5--5.27.6-----
2026.01
43.9--39.41-----
2026.05
43.7---3.824.7----
2026.01
43.4--2.55.3-----
2026.01
43.3--3.43.9-----
2026.05
43.3---3.724.1----
2025.12
43.29----163.721.74---
2026.05
43---8.253.3----
2026.01
42.7--7.91.7-----
2026.05
42.5---4.227.3----
2026.01
41.5--13.31-----
2026.05
41.5---16.5----
2026.05
41.5---1.27.8----
2025.12
41.46----131.141.38---
2026.05
40.3---3.623.4----
2026.01
39.6--12.91-----
2025.12
37.19----96.841---
2025.12
35.97----220.812.28---
2025.12
32.92----132.281.37---
2025.12
-26.8--------
2025.12
-37.8--------
2025.12
-48.8--------
2025.12
--37.8-------
2025.12
-25.6--------
2025.12
-36--------
2025.12
-50--------
2025.12
--37.2-------
2025.12
-34.7--------
2025.12
-38.4--------
2025.12
-38.4--------
2025.12
--37.2-------
2025.12
-28.1--------
2025.12
-42.1--------
2025.12
-50--------
2025.12
--40.1-------
2025.12
-29.3--------
2025.12
-39.6--------
2025.12
-51.9--------
2025.12
--40.3-------
2025.09
-46.6-11.899.748.39----
2025.09
-57.3-6.323.8104.4----
2025.11
---11.3-7.4-25683.937.8
2025.11
---10.8-12.6-256135.939
2025.11
---4.6-17.9-100.383.136.6
2025.11
---4.2-18.9-97.780.236
2025.11
---1.9-50.9-32.395.140.2
2026.03
-32.32---23.84----
2026.03
-32.32---177.4----
2026.03
-31.71---232.66----