Code Generation on RepoBench-P Python, XF-Random

64.5 Execution Match (EM)

Ours

May 18, 2026
Updated 15d ago

Evaluation Results

Method   Links
2026.05   64.5   79.2   1183.8
2026.05   64.1   78.8   2455.6
2026.05   63.8   78.5   2685.9
2026.05   60.2   76.8   2856.2
2026.05   58.6   75.4   3126.8
2026.05   43.1   66.8   451.8