
Agentic Security Evaluation on AgentDojo v1 (97 benign tasks, 27 injection tasks)

Utility Score: 60.8

NoDefense

May 11, 2026
Updated 21d ago

Evaluation Results

Date      Utility Score   Second score
2026.05   60.8            88.9
2026.05   59.8            85.2
2026.05   59.8            88.9
2026.05   55.7            88.9
2026.05   50.5            96.3
2026.05   48.5            96.3
2026.05   46.4            96.3
2026.05   46.4            96.3
2026.05   46.4            100
2026.05   45.4            100
2026.05   42.3            100
2026.05   38.1            100
2026.05   30.9            100
2026.05   29.9            100
2026.05   29.9            100
2026.05   28.9            100
2026.05   20.6            92.6
2026.05   19.6            88.9
2026.05   18.6            100
2026.05   11.3            100