Agent Task Performance on AgentDojo Slack
[Chart: Utility and Attack Success Rate by date. Shown point: No Defense, Utility 67.62, Dec 21, 2024.]
Updated 4d ago
Evaluation Results
Method      | Setting                   | Date    | Utility | Attack Success Rate
No Defense  | Model=GPT-4o, Attack T... | 2024.12 | 67.62   | 1,333
Task Shield | Model=GPT-4o, Attack T... | 2024.12 | 66.67   | 95
Task Shield | Model=GPT-4o, Attack T... | 2024.12 | 64.76   | 95
No Defense  | Model=GPT-4o, Attack T... | 2024.12 | 63.81   | 9,238
Task Shield | Model=GPT-4o, Attack T... | 2024.12 | 63.81   | 95
No Defense  | Model=GPT-4o, Attack T... | 2024.12 | 61.9    | 2,095