MultiPL-E

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Code Generation | MultiPL-E | Average Score: 76.5 | 35 |
| Coding | MultiPL-E | Score: 74.87 | 20 |
| Code Generation | MultiPL-E | Average Pass@1: 79.5 | 19 |
| Code Generation | MultiPL-E HumanEval translated from Python | C++ Pass Rate: 54.6 | 17 |
| Multilingual Code Completion | Multipl-E | Pass@1: 31.14 | 12 |
| Code Generation | MultiPL-E 2022 (test) | Java: 44.9 | 10 |
| Code Generation | MultiPL-E MBPP | Score: 58.8 | 9 |
| Code Generation | MultiPL-E Java | Pass@1: 42.07 | 6 |
| Code Generation | MultiPL-E | Pass@1 (Lua): 42 | 6 |
| Code Synthesis | MultiPL-E | Success Rate (Lua): 68 | 5 |
| Code Generation | MultiPL-E | Accuracy: 59.6 | 5 |
| Single line code infilling | MultiPL-E | Python SPM Exact Match: 74.5 | 5 |
| Code Generation | MultiPL-E v1 (test) | Accuracy: 59.1 | 3 |