| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Illicit task completion | AgentHarm English prompts | AgentHarm Score (AHS): 72.7 | 20 |
| Step-level tool invocation safety detection | AgentHarm Traj | Accuracy: 84.81 | 20 |
| Guarded Agent Evaluation | AgentHarm latest (full) | Refusal Rate: 97.16 | 14 |