Agentic Task Execution on AgentDojo Workspace
[Figure: results over time. Highlighted point: AGENTRIM, 76.5 BU, Jan 18, 2026. Metrics: BU, UUA, ASR.]
Updated 4d ago
Evaluation Results
Method          Date      BU     UUA    ASR
AGENTRIM        2026.01   76.5   69.7   0
Repeat prompt   2026.01   72     67.7   3.5
Baseline        2026.01   61.5   47.1   13.4
Delimiter       2026.01   53     47.9   8.31
Tool filter     2026.01   50.5   50.1   1.6
PI detector     2026.01   48     26.2   8.4