| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| HarmBench | Cres. | Attack Success Rate (ASR) | 100 | 487 | 4d ago |
| AdvBench | ArtPrompt | AASR | 8,712 | 263 | 18d ago |
| MaliciousInstruct | PiF | ASR | 100 | 161 | 22d ago |
| StrongREJECT | NeuroStrike | Attack Success Rate | 89.1 | 138 | 4d ago |
| JailbreakBench | | ASR@100 | | 132 | 1mo ago |
| SafeBench | ArtPrompt | ASR | 0 | 128 | 18d ago |
| Malicious goals dataset (test) | FigStep | ASR | 0 | 99 | 1mo ago |
| JailbreakBench | Audio Narrative Attacks | ASR | 96.33 | 76 | 4d ago |
| AdvBench (test) | R2J | ASR | 7,827 | 73 | 1mo ago |
| Behaviours | PRO-VICUNA-HUBER | ASR | 0.9 | 69 | 1mo ago |
| Advbench-M | | Attack Success Rate (ASR%) | 0 | 64 | 3d ago |
| JailbreakBench (JBB) | | ASR | 0 | 62 | 8d ago |
| HADES | MIDAS | Attack Success Rate | 97.36 | 59 | 1mo ago |
| JBB-Behaviors | TrojFill | Rule-Judge Score | 100 | 56 | 1mo ago |
| SafeBench | FigStep-Pro | HF | 1.7 | 54 | 1mo ago |
| GPTFuzz (test) | AutoDAN | ASR | 100 | 52 | 1mo ago |
| RedTeam 2K | MAPA | ASR | 94.79 | 52 | 1mo ago |
| AdvBench | PAIR | Attack Success Rate (ASR) | 0 | 48 | 4d ago |
| ShadowRisk | PAIR | ASR-KW | 100 | 48 | 1mo ago |
| AdvBench 50 | FlipAttack | ASR (KW) | 100 | 48 | 3d ago |
| Prefilling Attack 40 tokens | | ASR (%) | 0 | 45 | 1mo ago |
| Prefilling Attack 20 tokens | | ASR | 0.3 | 45 | 1mo ago |
| Prefilling Attack 10 tokens | | ASR | 70.91 | 45 | 1mo ago |
| AdvBench 150 Harmful Behaviors | RA-LLM | ASR | 0 | 45 | 1mo ago |
| JailbreakBench | A-Salami | Attack Success Rate (ASR) | 96 | 40 | 4d ago |