Agent Task Performance on AgentDojo Overall
[Chart: Utility and Attack Success Rate over time. Labeled point: Task Shield, Utility 69.79, Dec 21, 2024.]
Updated 4d ago
Evaluation Results

| Method | Details | Date | Utility | Attack Success Rate |
|---|---|---|---|---|
| Task Shield | Model=GPT-4o, Attack T... | 2024.12 | 69.79 | 0.0207 |
| Task Shield | Model=GPT-4o, Attack T... | 2024.12 | 69.48 | 0.0111 |
| No Defense | Model=GPT-4o, Attack T... | 2024.12 | 68.52 | 0.0572 |
| Task Shield | Model=GPT-4o, Attack T... | 2024.12 | 66.93 | 0.0048 |
| No Defense | Model=GPT-4o, Attack T... | 2024.12 | 66.77 | 0.0541 |
| No Defense | Model=GPT-4o, Attack T... | 2024.12 | 50.08 | 0.4769 |
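
As a rough way to read the results above, the sketch below averages Utility and Attack Success Rate per method. It rests on several assumptions: the per-row settings are truncated in the source, so rows are grouped by method name only; Attack Success Rate is treated as a fraction in [0, 1] while Utility is left on its listed scale; and the names `ROWS` and `summarize` are hypothetical helpers, not part of AgentDojo or the leaderboard.

```python
from statistics import mean

# Rows copied from the Evaluation Results listing:
# (method, utility, attack_success_rate).
# The per-row settings are truncated in the source, so they are not modelled.
ROWS = [
    ("Task Shield", 69.79, 0.0207),
    ("Task Shield", 69.48, 0.0111),
    ("No Defense", 68.52, 0.0572),
    ("Task Shield", 66.93, 0.0048),
    ("No Defense", 66.77, 0.0541),
    ("No Defense", 50.08, 0.4769),
]


def summarize(rows):
    """Return {method: (mean utility, mean attack success rate)}."""
    grouped = {}
    for method, utility, asr in rows:
        grouped.setdefault(method, []).append((utility, asr))
    return {
        method: (mean(u for u, _ in vals), mean(a for _, a in vals))
        for method, vals in grouped.items()
    }


summary = summarize(ROWS)
for method, (utility, asr) in summary.items():
    # Assumption: ASR is a fraction, so it is scaled by 100 for display.
    print(f"{method}: mean utility {utility:.2f}, mean ASR {asr * 100:.2f}%")
```

Averaging across rows is only a coarse summary, since the rows may correspond to different attack settings. A like-for-like comparison would need the full, untruncated settings for each row.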