
Code Generation on BIRD-Python Original (dev)

Execution Accuracy (Simple): 0.6584

Qwen3 (14B)

Jan 22, 2026
Updated 2d ago

Evaluation Results

Date     Execution Accuracy (Simple)   Column 2   Column 3   Column 4
2026.01  0.6584                        0.3233     0.2966     0.4289
2026.01  0.6238                        0.4504     0.4828     0.5574
2026.01  0.6227                        0.4375     0.4759     0.5528
2026.01  0.6032                        0.4526     0.4207     0.5398
2026.01  0.6022                        0.4224     0.4552     0.5332
2026.01  0.5957                        0.4504     0.4276     0.5359
2026.01  0.5838                        0.4181     0.3793     0.5143
2026.01  0.547                         0.3621     0.3448     0.472
2026.01  0.5124                        0.3384     0.3586     0.4452
2026.01  0.4692                        0.3405     0.3241     0.4153
2026.01  0.3989                        0.1897     0.2345     0.3194
2026.01  0.2315                        0.1509     0.1103     0.1745
2026.01  0.075                         0.0388     0.0172     0.0488
2026.01  0.0162                        0.0043     0          0.0085