
Agent Action

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
--- | --- | --- | ---
Indirect Prompt Injection | Agent Action Subset 2 | IR 0 | 24
Agent Action | Agent Action subset | RR 98 | 12
Prompt Injection Attack Success | Agent Action | IR 85 | 10
Indirect Prompt Injection Attack Success Evaluation | Agent Action Goal-Distant 2 | IRany 88 | 5
Indirect Prompt Injection Attack Success Evaluation | Agent Action Goal-Adjacent 2 | IRany 85 | 5