| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| RealToxicityPrompts challenging | RAD | Avg Toxicity (Max) | 6.2 | 46 | 1mo ago |
| ToxTET | - | ToxTET Rate | 29.39 | 33 | 1mo ago |
| Toxicity Mitigation Task | - | Generation Speed (s/item) | 1.03 | 30 | 1mo ago |
| ToxicTop | DATG-P | Relevance | 0.458 | 30 | 1mo ago |
| ToxicRandom | DATG-P | Relevance | 45.1 | 30 | 1mo ago |
| REALTOXICITYPROMPTS | EIGENSHIFT + Targeted Subspace Intervention | Toxicity | 21.24 | 24 | 1mo ago |
| RealToxicityPrompts 1k samples | PID-AcT | CLS Toxicity | 0.51 | 20 | 20d ago |
| RealToxicityPrompts (test) | DGLM | Full Toxicity | 10.1 | 14 | 1mo ago |
| RealToxicityPrompts (RTP) | CHaRS | CLS Tox Rate | 0.53 | 12 | 1mo ago |
| ToxicTop (test) | - | Perplexity | 33.21 | 6 | 1mo ago |
| ToxicRandom (test) | - | Perplexity | 27.21 | 6 | 1mo ago |
| DailyDialog | - | RTR (Backdoor Non-toxic) | 1.16 | 3 | 15d ago |
| Specialized category Manually-designed jailbreak attacks | Optimus (CH) | RTR | 4.5 | 3 | 15d ago |
| Toxic (test) | G | Toxicity Score | 56 | 2 | 1mo ago |
| Specialized category Optimization-based jailbreak attacks | - | - | - | 0 | 15d ago |