| Dataset Name | SOTA Method | Metric | Value | | | Trend |
|---|---|---|---|---|---|---|
| S-Eval Attack | | Attack Success Rate (ASR) | 92 | 72 | 1mo ago | |
| TRIDENT CORE | TRIDENT-CORE | HPR | 7 | 38 | 1mo ago | |
| HarmBench (400 random samples) | Llama-2-7B-Chat (Original) | ASR | 0 | 18 | 1mo ago | |
| StealthGraph SG-Implicit | | ASR | 91 | 12 | 1mo ago | |
| AdvBench | Amnesia | ASR Success Rate | 86.3 | 9 | 1mo ago | |
| TRIDENT-EDGE | TRIDENT-EDGE | HPR | 5 | 7 | 1mo ago | |
| Five Safety Benchmarks (AdvBench, HarmBench, HarmfulQ, JBBench, StrongReject) | QwQ | ASR | 7.69 | 6 | 5d ago | |
| StealthGraph SG-Origin | | ASR | 39.5 | 6 | 1mo ago | |
| HarmfulQA | | ASR | 16 | 6 | 1mo ago | |
| Do-Not-Answer | | ASR | 2.5 | 6 | 1mo ago | |
| FigStep Average | SafeThink | Average ASR | 0.053 | 5 | 1mo ago | |