Coding ability on OpenAI HumanEval (Score, TPF)

51.22 HumanEval Score

[Chart: HumanEval Score with Baseline; y-axis ticks 33.4672, 38.0761, 42.685, 47.2939; x-axis Oct 7, 2025]
Updated 1mo ago

Evaluation Results

Date      Score   TPF
2025.10   51.22   1
2025.10   51.22   4.97
2025.10   51.22   6
2025.10   36.59   4.69
2025.10   34.76   1
2025.10   34.15   3.82