
Code Domain

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Code Generation | Code Domain HumanEval, HumanEval+, MBPP, MBPP+, Bigcode (test) | HumanEval: 48.2 | 18
Text-Based Optimization | Code Domain (test) | Accuracy: 89.89 | 5