| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Jailbreak Attack | JailbreakBench | ASR@100 | 132 | |
| Jailbreak Attack | JailbreakBench | ASR 96.33 | 76 | |
| Jailbreak Attack | JailbreakBench (JBB) | ASR 0 | 62 | |
| Jailbreak Attack | JailbreakBench | Attack Success Rate (ASR) 96 | 40 | |
| Jailbreak Attack | JailbreakBench | ASR 91 | 27 | |
| Thinking Collapse Analysis | JailbreakBench (JBB) | Thinking Collapse Rate 0 | 25 | |
| Jailbreaking | JailbreakBench | Attack Success Rate (ASR) 2 | 21 | |
| Jailbreaking | JailbreakBench | ASR (Detoxify) 0 | 20 | |
| Jailbreak Defense | JailbreakBench | Rate of Response Safety 70 | 20 | |
| Adversarial and Jailbreaking Attack Detection | JailbreakBench | AUROC 0.8622 | 20 | |
| Red-teaming Attack Success Rate | JailbreakBench (test) | ASR (Vicuna) 82 | 18 | |
| Jailbreak Safety | JailbreakBench | Reasoning Harmful Ratio 0.3 | 17 | |
| Safety Evaluation | JailbreakBench | Harmful Rate 0 | 16 | |
| Jailbreak Attack | JailbreakBench PAIR | Attack Success Rate (ASR) 82 | 15 | |
| Jailbreak Defense | JailbreakBench Qwen-2.5-7B | ASR 0 | 12 | |
| Jailbreak Defense | JailbreakBench (JBB) | FNR 36.11 | 12 | |
| Jailbreak Detection | JailBreakBench Single Turn 35 | F1 Score 98 | 10 | |
| Jailbreak Attack | JailbreakBench | Safe2Harm ASR (Answer) 97.98 | 10 | |
| Jailbreak Attack | JailbreakBench GCG | ASR 0.58 | 10 | |
| Jailbreak Attack | JailbreakBench (test) | ASR (Refusal) 32 | 10 | |
| Jailbreak Safety Evaluation | JailbreakBench | Attack Success Rate (J) 2 | 9 | |
| Jailbreak Defense | JailbreakBench Vicuna-13B | ASR 0 | 7 | |
| Jailbreak Defense | JailbreakBench LLaMA-3-8B | ASR 0 | 7 | |
| Jailbreak Defense Efficiency and Effectiveness | JailbreakBench Summary Aggregates | Avg ASR 1 | 5 | |
| Jailbreak Attack | JailbreakBench 2024a | Average Latency (s) 2.4 | 4 | |