| Dataset Name | SOTA Method | Metric | Value | Trend | | |
|---|---|---|---|---|---|---|
| Harmful Prompts Curated April 13, 2023 | | Bad Bot Rate | 0 | | 61 | 3mo ago |
| WildJailbreak | | Performance Rate | 98.5 | | 22 | 15d ago |
| Red Queen SET | Qwen 3.5 35B | Passed Count | 23 | | 18 | 1mo ago |
| JailBreak R1 | SInternal | Attack Success Rate (ASR) | 1.3 | | 12 | 22d ago |
| curated dataset (test) | | BAD BOT Rate | 0 | | 11 | 3mo ago |
| Multilingual Jailbreak Dataset (Evaluation set) | | JSR | 2.3 | | 10 | 15d ago |
| StrongReject | τ_trigger ⊕ PAP | ASR-J | 95.5 | | 9 | 1mo ago |
| Synthetic dataset (held-out) | | Good Bot Rate | 100 | | 8 | 3mo ago |
| sexual-content prompts | gpt-5-thinking | Non-Unsafe Rate | 99.5 | | 4 | 3mo ago |
| abuse, disinformation, hate prompts | gpt-5-thinking | Not Unsafe Rate | 99.9 | | 4 | 3mo ago |
| violence prompts | gpt-5-thinking | Non-Unsafe Rate | 99.9 | | 4 | 3mo ago |
| illicit non-violent crime prompts | gpt-5-thinking | Not Unsafe Rate | 99.5 | | 4 | 3mo ago |