| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| CivilComments sensitive attribute: MUSLIM (test) | | Balanced Accuracy | 59.9 | 57 | 3mo ago |
| Average across WZ, DC, HX, OR | | Harmonic F1 | 48.8 | 26 | 3mo ago |
| OR | | Harmonic F1 | 49.7 | 26 | 3mo ago |
| HX | | H.-F1 | 44.2 | 26 | 3mo ago |
| DC | ToxiGAN | Harmonic Mean F1 | 31 | 26 | 3mo ago |
| WZ | | Harmonic F1 | 73.4 | 26 | 3mo ago |
| ToxCMM | ToxVidLM | F1 Score | 94.35 | 24 | 3mo ago |
| Toxigen | MAT-STEER | Accuracy | 60.41 | 22 | 3mo ago |
| ToxiFrench Sbench | | Class 0 Precision | 99 | 19 | 1mo ago |
| ToxiCN (test) | CITD | Accuracy | 91.47 | 19 | 12d ago |
| COLD (test) | COLD | Accuracy | 94.33 | 19 | 12d ago |
| Personification GPT-3 prompted (test) | V-REx | Loss | 0.69 | 16 | 3mo ago |
| RealToxicity Prompts GPT-3 prompted (test) | V-REx | Loss | 0.61 | 16 | 3mo ago |
| CivilComments (CC) (test) | gDRO | Worst-Group Accuracy | 79.66 | 13 | 3mo ago |
| CivilComments WILDS | Fish | Worst-Group Accuracy | 75.3 | 11 | 1mo ago |
| Jigsaw dataset | | Rescue Rate | 44.2 | 9 | 1d ago |
| Toxicity Dataset (test) | CoGate-LSTM | Test Accuracy | 96 | 9 | 1mo ago |
| CNTP (test) | | Accuracy | 98.53 | 7 | 12d ago |
| SCCD (test) | CITD | Accuracy | 93.73 | 7 | 12d ago |
| SWSR (test) | CITD | Accuracy | 91.33 | 7 | 12d ago |
| DeToxy-B (test) | ToxiAlert | Balanced ACC | 72.29 | 6 | 16d ago |
| EEUCA 2026 (val) | | Macro F1 | 67 | 6 | 23d ago |
| Toxic | | Precision | 66.6 | 6 | 1mo ago |
| Jigsaw (test) | CoGate-LSTM | Accuracy | 96 | 6 | 1mo ago |
| Toxicity | AdvDemo + CW | Original Accuracy | 90.4 | 6 | 28d ago |