
MBPP

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Code Generation | MBPP (test) | Pass@1 95.1 | 298 |
| Code Generation | MBPP+ | Pass@1 84.39 | 216 |
| Code Generation | MBPP | Pass@1 89.1 | 193 |
| Code Generation | MBPP | Pass@1 91.8 | 159 |
| Code Generation | MBPP | Accuracy 79.8 | 159 |
| Code Generation | MBPP | Accuracy (%) 92.2 | 146 |
| Coding | MBPP | Accuracy 98.4 | 116 |
| Code Generation | MBPP+ | Accuracy 75.9 | 104 |
| Code Generation | MBPP-ET | Pass@1 91.8 | 91 |
| Code Generation | MBPP | Accuracy 96.6 | 90 |
| Code Generating | MBPP | Pass@1 83.1 | 88 |
| Code Generation | MBPP Plus (test) | Accuracy 83.6 | 87 |
| Code Generation | MBPP | Accuracy 90.5 | 74 |
| Code Generation | MBPP | Pass@1 Accuracy 94.2 | 59 |
| Function-level Code Generation | MBPP+ augmented (test) | Pass@1 79.6 | 56 |
| Code Generation | MBPP | Tau Correlation 9.94 | 55 |
| Coding | MBPP+ | Pass@1 97.88 | 52 |
| Code Generation | MBPP Sanitized | Accuracy 85.7 | 51 |
| Code | MBPP | Pass@1 77.9 | 49 |
| Code Generation | MBPP+ | Score 94.2 | 43 |
| Code generation | MBPP | Pass@1 80.4 | 41 |
| Code Generation | MBPP | Score 58 | 38 |
| Code Generation | MBPP | Accuracy 68.8 | 36 |
| Code Generation | MBPP | MBPP Score 66.17 | 35 |
| Code Reasoning | MBPP | MBPP Execution Accuracy 84.7 | 33 |
Showing 25 of 192 rows
...