Coding Agent on Bird
Best result: TDScaling, 43.83 Pass@1 (Feb 3, 2026)
Evaluation Results
Method                        Category          Date     Pass@1
TDScaling                     Proposed Impl...  2026.02  43.83
Qwen3-Coder-30B-A3B-Instruct  Baseline          2026.02  41.48
APIGen-MT                     Tool-Learning...  2026.02  34.18
TOUCAN                        Tool-Learning...  2026.02  32.89
Simia                         Tool-Learning...  2026.02  31.16