
ASB

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Guarded Agent Evaluation | ASB latest (IPI) | ASR: 5.5 | 14 |
| Guarded Agent Evaluation | ASB latest (DPI) | ASR: 95.25 | 14 |
| Agent Security | ASB (Agent Security Benchmark) | No Attack UA: 90 | 8 |
| Agent Security Evaluation | ASB (Agent Security Benchmark) | ASR-d (ASB): 7 | 8 |
| Secure LLM Agent Task Completion | ASB | Benign Utility: 78.75 | 4 |