| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Target Response Induction | VLGuard | ASR | 0.1 | 48 |
| Safety Evaluation | VLGuard | ASR | 0.23 | 27 |
| Safety and Helpfulness Evaluation | VLGuard | Safety Score | 88.48 | 18 |
| Vision-text safety classification | VLGuard | AUPRC (Prompt) | 0.8843 | 9 |
| Unsafe content detection | VLGuard | F1 Score | 79.3 | 9 |
| Multimodal Safety Evaluation | VLGuard (test) | Accuracy | 86.78 | 6 |
| Multimodal Jailbreaking | VLGuard Unsafe (OOD) | ASR | 66.7 | 6 |
| Over-Prudence Evaluation | VLGuard | RR (Before) | 4.48 | 6 |
| Jailbreak Attack | VLGuard Safe | Attack Success Rate (ASR) | 8.44 | 5 |
| Jailbreak Attack | VLGuard Image Unsafe | ASR | 52.49 | 5 |
| Jailbreak Attack | VLGuard Text Unsafe | ASR | 34.59 | 5 |
| Jailbreak Attack | VLGuard (All) | ASR | 17.88 | 5 |
| Safety Robustness | VLGuard Unsafe | Attack Success Rate | 13.4 | 4 |
| Safety Robustness | VLGuard Safe_Unsafe | Attack Success Rate | 14.9 | 4 |