
HumanEval and MBPP

Benchmarks

Task Name       | Dataset Name                     | SOTA Result                | Trend
Code Generation | HumanEval and MBPP               | Overall Average Score 85.6 | 30
Code Generation | HumanEval and MBPP EvalPlus      | HumanEval+ Pass@k 70.1     | 29
Code-writing    | HumanEval & MBPP EvalPlus (test) | HumanEval Pass Rate 39.02  | 4