| Dataset Name | SOTA Method | Metric | Value | | Updated | Trend |
|---|---|---|---|---|---|---|
| JBB | | Jailbreak Rate | 0 | 70 | 4d ago | |
| Sorry | Abliteration | Jailbreak Rate | 99.8 | 70 | 4d ago | |
| AdvBench | LATS | Avg Queries | 2.1 | 63 | 4d ago | |
| HarmBench Standard Behaviours (200 examples) | AutoDan | ASR | 0 | 48 | 4d ago | |
| AdvBench Ensemble configuration GPT-4o | ArtPrompt | Attack Success Rate (ASR) | 0 | 25 | 4d ago | |
| Jailbreak | | MMLU | 65.5 | 20 | 4d ago | |
| AdvBench Ensemble configuration Average | ArtPrompt | Harmfulness Score (HS) | 1.43 | 15 | 4d ago | |
| AdvBench Ensemble configuration Llama-3-70B | SATA-MLM | Harmfulness Score (HS) | 4.6 | 15 | 4d ago | |
| AdvBench Ensemble configuration Claude-v2 | ArtPrompt | Harmfulness Score (HS) | 1.08 | 15 | 4d ago | |
| Advbench-M Image + Text (test) | JOOD | HF (BE) | 980 | 13 | 4d ago | |
| SUDO (full) | | ASR (%) | 0 | 5 | 4d ago | |