| Dataset Name | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| StrongReject | gpt-5-thinking-nano | Jailbreak Resistance: Illicit Crime | 99.7 | 4 | 3mo ago |
| DRA | DR-IRL | Refusal Rate | 64.92 | 3 | 3mo ago |
| GCG | DR-IRL | Refusal Rate | 96.98 | 3 | 3mo ago |
| AutoDAN | DR-IRL | Refusal Rate | 59 | 3 | 3mo ago |
| TAP Attack | Buffer-and-Reinforce | Harmful Score | 20 | 2 | 8d ago |
| GCG Attack | | HS Score | 74.7 | 2 | 8d ago |
| Backdoor Attack | Buffer-and-Reinforce | Harmful Score (HS) | 8.8 | 2 | 8d ago |