Agent Task Performance on AgentDojo Workspace
[Chart: Utility and Attack Success Rate over time. Highlighted point: Task Shield, Utility 66.67, Dec 21, 2024.]
Updated 1mo ago
Evaluation Results
Method        Setting                      Date      Utility   Attack Success Rate
Task Shield   Model=GPT-4o, Attack T...    2024.12   66.67     0
No Defense    Model=GPT-4o, Attack T...    2024.12   64.58     0
Task Shield   Model=GPT-4o, Attack T...    2024.12   62.92     0
Task Shield   Model=GPT-4o, Attack T...    2024.12   62.5      42
No Defense    Model=GPT-4o, Attack T...    2024.12   61.67     0
No Defense    Model=GPT-4o, Attack T...    2024.12   24.17     4,042