
HumanEval+

Benchmarks

Task Name            | Dataset Name         | SOTA Result             | Trend
Code Generation      | HumanEval+ (test)    | Pass@1 81.7             | 81
Code Generation      | HumanEval+ v1 (test) | Pass Rate 87.8          | 41
Unit test generation | HumanEval+ (test)    | Error Rate 1.27         | 7
Code Reasoning       | HumanEval+           | Average Score @16 82.29 | 6
Code Generation      | HumanEval+ ko        | Score 92.1              | 3