| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Jailbreak Attack | StrongREJECT | Attack Success Rate | 86.3 | 88 |
| Safety Evaluation | StrongReject | Attack Success Rate | 0.64 | 45 |
| Red-teaming Safety Evaluation | StrongReject | ASR | 7 | 32 |
| Multi-turn Jailbreaking | StrongReject (test) | ASR | 0.34 | 30 |
| Safety assessment | StrongReject | Personalization Bias (PB) | 0.179 | 9 |
| Jailbreak Attack | StrongREJECT 2024 (Full) | AK | 100 | 9 |
| Safety | SR (StrongReject) | Safety Rate | 99.7 | 8 |
| Jailbreak Robustness | StrongREJECT | Safe Response Rate | 95.39 | 8 |
| Jailbreak Resistance | StrongReject | Jailbreak Resistance: Illicit Crime | 99.7 | 4 |
| Safety Evaluation | StrongReject (test) | StrongReject Score | 100 | 4 |
| Jailbreak Attack | StrongREJECT (test) | Full Success Rate | 98.8 | 3 |
| Adversarial Robustness (GCG attack) | StrongREJECT (sample of 40 prompts) | Mean StrongREJECT Score | 26.87 | 3 |
| Safety Alignment | StrongReject | Metric | - | 0 |