Agentic Task Execution on AgentDojo Slack
[Chart: BU, UUA, and ASR over time. Highlighted point: Delimiter, BU 86.7, Jan 18, 2026.]
Evaluation Results
Method         Date     BU    UUA   ASR
Delimiter      2026.01  86.7  62.1  32.4
Repeat prompt  2026.01  82.7  62.5  29.5
Baseline       2026.01  80.8  60.9  56.4
AGENTRIM       2026.01  80    58.9  12.7
Tool filter    2026.01  66.2  47.5  4
PI detector    2026.01  27.6  18.9  7.6
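
To make the trade-offs in the results above easier to read, the sketch below computes each method's change relative to the Baseline row. The scores are copied from the table. The names `RESULTS` and `deltas_vs_baseline` are hypothetical, and sorting by ASR change assumes that a lower ASR is better, which this page does not state.

```python
# Rows copied from the Evaluation Results table: (method, BU, UUA, ASR).
RESULTS = [
    ("Delimiter", 86.7, 62.1, 32.4),
    ("Repeat prompt", 82.7, 62.5, 29.5),
    ("Baseline", 80.8, 60.9, 56.4),
    ("AGENTRIM", 80.0, 58.9, 12.7),
    ("Tool filter", 66.2, 47.5, 4.0),
    ("PI detector", 27.6, 18.9, 7.6),
]


def deltas_vs_baseline(rows, baseline_name="Baseline"):
    """Return (method, dBU, dUUA, dASR) for every non-baseline row.

    Each delta is the method's score minus the baseline's score, rounded
    to one decimal. Rows are sorted by dASR ascending, i.e. the largest
    ASR reduction comes first (assumption: lower ASR is better).
    """
    base = next(r for r in rows if r[0] == baseline_name)
    out = []
    for name, bu, uua, asr in rows:
        if name == baseline_name:
            continue
        out.append((
            name,
            round(bu - base[1], 1),
            round(uua - base[2], 1),
            round(asr - base[3], 1),
        ))
    return sorted(out, key=lambda r: r[3])


if __name__ == "__main__":
    for name, d_bu, d_uua, d_asr in deltas_vs_baseline(RESULTS):
        print(f"{name:<14} BU {d_bu:+6.1f}  UUA {d_uua:+6.1f}  ASR {d_asr:+6.1f}")
```

Read this way, Tool filter shows the largest ASR drop (56.4 to 4) but gives up 14.6 points of BU, while AGENTRIM lowers ASR to 12.7 with a BU change of under one point.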