
HumanEval-X

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Code Generation | HumanEval-X C++ | Compilation Success Rate: 94.5 | 13 |
| Python Coding | HumanEval-X v1 (test) | Pass@1: 48.2 | 13 |
| Code Translation | HumanEval-X (test) | CA@1 (C++ to Java): 87.19 | 11 |
| Code Translation (Go to Java) | HumanEval-X 1.0 (test) | Pass@1: 54.22 | 4 |
| Code Translation (Go to C++) | HumanEval-X 1.0 (test) | Pass@1: 38.97 | 4 |
| Code Translation (JavaScript to Go) | HumanEval-X 1.0 (test) | Pass@1: 33.38 | 4 |
| Code Translation (JavaScript to Java) | HumanEval-X 1.0 (test) | Pass@1: 56.55 | 4 |
| Code Translation (JavaScript to C++) | HumanEval-X 1.0 (test) | Pass@1: 46.87 | 4 |
| Code Translation (Java to Go) | HumanEval-X 1.0 (test) | Pass@1: 34 | 4 |
| Code Translation (Java to C++) | HumanEval-X 1.0 (test) | Pass@1: 49.67 | 4 |
| Code Translation (Java to Python) | HumanEval-X 1.0 (test) | Pass@1: 75.03 | 4 |
| Code Translation (C++ to JavaScript) | HumanEval-X 1.0 (test) | Pass@1: 54.51 | 4 |
| Code Translation (C++ to Java) | HumanEval-X 1.0 (test) | Pass@1: 71.68 | 4 |
| Code Translation (C++ to Python) | HumanEval-X 1.0 (test) | Pass@1: 62.79 | 4 |
| Code Translation (Python to Go) | HumanEval-X 1.0 (test) | Pass@1: 28.87 | 4 |
| Code Translation (Python to Java) | HumanEval-X 1.0 (test) | Pass@1: 41.98 | 4 |
| Code Translation (Python to C++) | HumanEval-X 1.0 (test) | Pass@1: 35.94 | 4 |
| Code Translation | HumanEval-X | Pass@1 (Python): 89.8 | 4 |