Agentic Task Execution on AgentDojo Banking
Top result: Repeat prompt, 91.2 BU (Jan 18, 2026)
Metrics: BU, UUA, ASR
Updated 4d ago
Evaluation Results
Method          Date      BU     UUA    ASR
Repeat prompt   2026.01   91.2   82.3   22.7
Delimiter       2026.01   85.6   75.2   28.3
AGENTRIM        2026.01   80     72.4   0.9
Baseline        2026.01   78.7   76.5   36.5
Tool filter     2026.01   70     62.1   11.4
PI detector     2026.01   38.8   31.1   0.8