| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Jailbreak Attack | MaliciousInstruct | ASR 100 | 161 |
| Visual Jailbreaking Attack | MaliciousInstruct | ASR 92 | 61 |
| Jailbreaking | MaliciousInstruct (test) | BERT Score 4.63 | 32 |
| Adversarial and Jailbreaking Attack Detection | MaliciousInstruct | AUROC 0.8825 | 20 |
| Jailbreaking | MaliciousInstruct | BERT Score 4.8 | 11 |
| Jailbreak Attack | MaliciousInstruct (test) | ASR (Refusal) 95 | 10 |
| Jailbreak Attack | MaliciousInstruct | ASR (Google Banana 2) 62 | 8 |
| Jailbreak Attack | MaliciousInstruct 41 (test) | ASR 0.935 | 6 |