| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Jailbreak Attack | HADES | Attack Success Rate: 97.36 | 59 |
| Visual Jailbreaking Attack | HADES | ASR: 72.66 | 32 |
| Jailbreak Defense | HADES | ASR: 3.6 | 24 |
| Multimodal Jailbreaking | HADES-Dataset | ASR (%): 99.07 | 20 |
| Jailbreak Attack Success Evaluation | HADES SD+TYPO | Attack Success Rate (ASR): 2.1 | 18 |
| Jailbreak Attack Success Evaluation | HADES SD | ASR: 2.8 | 18 |
| Jailbreak Attack | HADES | Success Rate (Animal): 66.67 | 18 |
| Image-based Jailbreak | HADES OOD | Attack Success Rate (ASR): 80.02 | 16 |
| Jailbreak Attack | HADES All categories | ASR: 18.93 | 15 |
| Jailbreak Attack | HADES Violence | ASR: 0.3 | 15 |
| Jailbreak Attack | HADES Animals | ASR: 3.33 | 15 |
| Jailbreak Attack | HADES Financial | ASR: 95.33 | 15 |
| Jailbreak Attack | HADES Privacy | ASR: 95.33 | 15 |
| Jailbreak Attack | HADES Self-harm | ASR: 5.33 | 15 |
| Jailbreak Attack | HADES (test) | Self-harm Success Rate: 96 | 15 |
| Multimodal Adversarial Attack | HADES | Success Rate (Animal): 91.33 | 6 |
| Safety Evaluation | HADES-Dataset | HADES Score: 43.67 | 4 |