Tool Execution on Trace-based setting
[Chart: Improvement (%) by date. Data point shown: Trace-based, 14.8, Feb 23, 2026. Available metrics: Improvement (%), Degradation (%), Avg Delta.]
Updated 4d ago
Evaluation Results
Method        Date     Improvement (%)  Degradation (%)  Avg Delta
Trace-based   2026.02  14.8             14.8             0.0153
D2            2026.02  14.5             14               0.0172
DRAFT         2026.02  11.1             10.9             0.0145
Play2Prompt   2026.02  9.3              9.8              -0.0002