| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Guarded Agent Evaluation | ASB latest (IPI) | ASR 5.5 | 14 |
| Guarded Agent Evaluation | ASB latest (DPI) | ASR 95.25 | 14 |
| Agent Security | ASB (Agent Security Benchmark) | No Attack UA 90 | 8 |
| Agent Security Evaluation | ASB (Agent Security Benchmark) | ASR-d (ASB) 7 | 8 |
| Secure LLM Agent Task Completion | ASB | Benign Utility 78.75 | 4 |