| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MM-SafetyBench SD + TYPO + SD_TYPO (test) | DefenSee | ASR Score | 0.08 | 8 | 4d ago |
| multimodal safety dataset | | ASR | 0.13 | 6 | 4d ago |
| Image input safety evaluation set | gpt-5-thinking-nano | Hate Safety Acc | 98.6 | 4 | 4d ago |
| ChatGPT image input safety evaluations | | Hate Safety | 98.9 | 4 | 4d ago |
| MM-SafeBench | | Forbidden Statements ASR | 1.04 | 4 | 4d ago |
| SafeBench | | FS ASR | 3.26 | 4 | 4d ago |
| GOAT (test) | OSGA | Misogyny Accuracy | 56.9 | 2 | 4d ago |