| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Jailbreak Attack | Gemini Flash 2.5 (test) | ASR: 0 | 27 |
| Black-box Adversarial Attack | Gemini 2.5-Pro | KMRa: 0.87 | 9 |
| Jailbreaking | Gemini Pro 3 | ASR: 92.5 | 9 |
| Adversarial Attack | Gemini 2.0 | CLIP Similarity (RN-50): 0.2617 | 9 |
| Image Captioning | Gemini Image Captioning Hard Criterion 1.5 | ASR: 81 | 8 |
| Multi-shot video generation | Gemini 100 multi-shot video prompts 2.5 Pro | Intra-shot Consistency (Subject): 0.825 | 8 |
| Policy Corruption Evaluation | Gemini-2-Flash | Compliance: 3.65 | 5 |
| Targeted Adversarial Attack | Gemini Flash 1.5 | Attack Success Rate (T1): 58 | 4 |
| Targeted Adversarial Attack | Gemini 2.0 | ASR520 | 4 |