| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Harmful prompts dataset | | Attack Success Rate | 97 | 49 | 4d ago |
| MultiJail | Self Defense | ASR (EN) | 13.02 | 18 | 4d ago |
| AdvBench-x | SmoothLLM | ASR (English) | 7.94 | 18 | 4d ago |
| VAJA | ZO-SPSA | Identity Attack Success Rate | 92.3 | 15 | 4d ago |
| HEx-PHI (test) | CS-DJ | ASR Category 1 | 86.67 | 12 | 4d ago |
| HarmBench (50 randomly sampled questions) | GPT-OSS-20B | ASR | 84 | 8 | 4d ago |