
Code Optimization on HumanEval-Hard Cross-Domain

13.6 Calls per Task

TextBFGS

Jan 20, 2026
Updated 3mo ago

Evaluation Results

Date      Calls per Task
2026.01   13.6             1,581.7    21.6
2026.01   13.9             1,594.4    22.2
2026.01   17               1,481.3    25.2
          29.8             1,464.2    43.7
2026.01   35.8             863.9      30.9