| Dataset Name | SOTA Method | Metric | Value | | Updated |
|---|---|---|---|---|---|
| HarmBench | Cres. | Attack Success Rate (ASR) | 100 | 557 | 8d ago |
| AdvBench | ArtPrompt | AASR | 8,712 | 271 | 12d ago |
| StrongREJECT | EnumAttack | Attack Success Rate | 99.4 | 262 | 12d ago |
| SafeBench | ArtPrompt | ASR | 0 | 245 | 14d ago |
| JailbreakBench | DictAttack | ASR | 100 | 242 | 12d ago |
| HarmBench (test) | TTV-GDI | ASRHB | 99.73 | 212 | 1d ago |
| MaliciousInstruct | PiF | ASR | 100 | 161 | 2mo ago |
| AdvBench | EnumAttack | ASR | 100 | 133 | 6d ago |
| JailbreakBench | | ASR@100 | | 132 | 3mo ago |
| Malicious goals dataset (test) | FigStep | ASR | 0 | 99 | 2mo ago |
| HADES | DMN | Attack Success Rate | 99.6 | 92 | 14d ago |
| AdvBench (test) | R2J | ASR | 7,827 | 73 | 3mo ago |
| Malicious Prompts English (test) | Caesar | ASR@5 | 100 | 72 | 1mo ago |
| Behaviours | PRO-VICUNA-HUBER | ASR | 0.9 | 69 | 3mo ago |
| Advbench-M | | Attack Success Rate (ASR%) | 0 | 64 | 1mo ago |
| Advbench subset | CodeAttack | ASR | 96 | 64 | 8d ago |
| 159 harmful behaviors (test) | CrescendoX | ASR | 99.37 | 63 | 21d ago |
| JailbreakBench (JBB) | | ASR | 0 | 62 | 1mo ago |
| JailBreakV_28K | UJEM-KL | Attack Success Rate (ASR) | 88.32 | 57 | 21d ago |
| JBB-Behaviors | TrojFill | Rule-Judge Score | 100 | 56 | 3mo ago |
| SafeBench | FigStep-Pro | HF | 1.7 | 54 | 2mo ago |
| GPTFuzz (test) | AutoDAN | ASR | 100 | 52 | 3mo ago |
| RedTeam 2K | MAPA | ASR | 94.79 | 52 | 2mo ago |
| AdvBench | PAIR | Attack Success Rate (ASR) | 0 | 48 | 1mo ago |
| ShadowRisk | PAIR | ASR-KW | 100 | 48 | 3mo ago |