
Coding Ability on HumanEval (test)

Accuracy: 28.49

Alpaca-GPT4 + NAIT (CodeX)

Mar 13, 2026
Updated 1mo ago

Evaluation Results

Method   Date      Accuracy                     Links
         2026.03   28.49      37.06    2.88
         2026.03   27.92      37.16    3.15
         2026.03   27.87      36.03    -
         2026.03   27.84      37.7     4.65
         2026.03   27.75      36.17    0.39
         2026.03   26.44      37.2     3.24
         2026.03   25.55      35.69    -0.94
         2026.03   25.23      37.18    3.2
         2026.03   25.19      35.68    -0.98
         2026.03   25.15      36.47    1.23
         2026.03   25.02      36.71    1.89
         2026.03   23.94      35.18    -2.34