
Multi-turn tool calling on τ2-bench

Overall Score: 17.77

Toucan

[Chart: Overall Score over time; latest point Jan 6, 2026]
Updated 4d ago

Evaluation Results

Date    | Overall Score | Sub-scores
2026.01 | 17.77         | 20, 22.8, 10.5
2026.01 | 17.62         | 38, 6.1, 8.77
2026.01 | 17.02         | 30, 6.14, 14.91
2026.01 | 16.08         | 14, 17.54, 16.7
2026.01 | 13.56         | 24, 7.02, 9.65