
HumanEval-X

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Python Coding | HumanEval-X v1 (test) | Pass@1: 48.2 | 13 |
| Code Translation (Go to Java) | HumanEval-X 1.0 (test) | Pass@1: 54.22 | 4 |
| Code Translation (Go to C++) | HumanEval-X 1.0 (test) | Pass@1: 38.97 | 4 |
| Code Translation (JavaScript to Go) | HumanEval-X 1.0 (test) | Pass@1: 33.38 | 4 |
| Code Translation (JavaScript to Java) | HumanEval-X 1.0 (test) | Pass@1: 56.55 | 4 |
| Code Translation (JavaScript to C++) | HumanEval-X 1.0 (test) | Pass@1: 46.87 | 4 |
| Code Translation (Java to Go) | HumanEval-X 1.0 (test) | Pass@1: 34 | 4 |
| Code Translation (Java to C++) | HumanEval-X 1.0 (test) | Pass@1: 49.67 | 4 |
| Code Translation (Java to Python) | HumanEval-X 1.0 (test) | Pass@1: 75.03 | 4 |
| Code Translation (C++ to JavaScript) | HumanEval-X 1.0 (test) | Pass@1: 54.51 | 4 |
| Code Translation (C++ to Java) | HumanEval-X 1.0 (test) | Pass@1: 71.68 | 4 |
| Code Translation (C++ to Python) | HumanEval-X 1.0 (test) | Pass@1: 62.79 | 4 |
| Code Translation (Python to Go) | HumanEval-X 1.0 (test) | Pass@1: 28.87 | 4 |
| Code Translation (Python to Java) | HumanEval-X 1.0 (test) | Pass@1: 41.98 | 4 |
| Code Translation (Python to C++) | HumanEval-X 1.0 (test) | Pass@1: 35.94 | 4 |
| Code Translation | HumanEval-X | Pass@1 (Python): 89.8 | 4 |