
Code Generation on HumanEval++

98.44 Accuracy

AFlow

[Chart: Accuracy over time, Nov 11, 2025 to Feb 23, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2026.02   98.44
2026.02   98.44
2026.02   96.88
2026.02   96.88
2026.02   95.31
2026.02   93.75
2026.02   93.75
2026.02   92.19
2026.02   92.19
2026.02   90.62
2026.02   90.62
2026.02   89.06
2026.02   89.06
2026.02   89.06
2026.02   87.5
2026.02   87.5
2026.02   87.5
2026.02   87.5
2026.02   85.94
2026.02   85.94
2026.02   85.94
2026.02   84.14
2025.11   72
2025.11   70.7
2025.11   70.1
2025.11   69.2
2025.11   69.2
2025.11   51.5
2025.11   50
2025.11   43.3
2025.11   40.9
2025.11   39
2025.11   22
2025.11   21.6
2025.11   18.9
2025.11   16.8
2025.11   16.4
2025.11   14.3
2025.11   9.1