
AgentDyn

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Dynamic Agent Security and Utility Evaluation | AgentDyn | ASR 52 | 22 |
| Prompt Injection Defense | AgentDyn | ASR 1 | 9 |
| Agent Safety Evaluation | AgentDyn | Benign Rate 100 | 2 |