| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Jailbreak Attack Defense | FORTRESS | ASR | 9.8 | 24 |
| Overrefusal Evaluation | Fortress OR | Helpfulness Score | 97.6 | 12 |
| Jailbreaking Safety Evaluation | Fortress | Safety Score | 86.84 | 12 |
| Non-Agentic Performance Evaluation | Fortress (test) | Mean Score | 78.75 | 4 |
| Safety Evaluation | Fortress | Cost per Accuracy Point ($) | 0.0016 | 4 |