| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Video Jailbreaking | MM-SafetyBench 1.0 (test) | Attack Success Rate | 96 | 48 |
| Safety Evaluation | MM-SafetyBench | Average ASR | 0 | 42 |
| MLLM Jailbreaking | MM-SafetyBench Physical Harm scenario | ASR | 6 | 15 |
| Multimodal Jailbreak Defense | MM-SafetyBench (full) | ASR (Illegal Activity - S) | 1.03 | 12 |
| Multimodal Safety Defense | MM-SafetyBench SD_TYPO | Average ASR | 12 | 10 |
| Multimodal Safety Defense | MM-SafetyBench SD | Average ASR | 0.09 | 10 |
| Harmful Rate Evaluation | MM-SafetyBench OCR (test) | Illegal Activity Rate | 0 | 10 |
| Jailbreak Detection | MM-SafetyBench | AUROC | 99.18 | 9 |
| Multimodal Safety Evaluation | MM-SafetyBench SD + TYPO + SD_TYPO (test) | ASR Score | 0.08 | 8 |
| Jailbreaking Attack | MM-SafetyBench | Attack Success Rate (ASR) | 91.5 | 8 |
| Multi-turn Jailbreaking Attack | MM-SafetyBench Turn 2 | ASR | 24.42 | 5 |
| Safety Evaluation | MM-SafetyBench SD 1.0 | Illegal Activity Score | 48.3 | 5 |
| Safety Evaluation | MM-SafetyBench H | Safety Score | 1 | 4 |