| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| StrongReject | | Direct Attack Rate | 67 | 30 | 1mo ago |
| JBB-Behaviors (test) | MOSA | ASR | 0 | 24 | 1mo ago |
| AdvBench | Recovery Alignment | PAIR ASR (GPT-4o) | 4 | 18 | 9d ago |
| JBB-Behaviors | RA (ours) | ASR (PAIR, Guardrail Model) | 0.3 | 18 | 1mo ago |
| StrongREJECT | ReSA-SFT | Safe Response Rate | 96.33 | 12 | 1mo ago |
| Jailbreak Cipher, CodeChameleon (test) | DASE | Cipher Success Rate | 95.75 | 10 | 1mo ago |
| DAN Static | FedDetox | ASR | 74.8 | 3 | 9d ago |