| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Toxicity Detection | BAD | F1 Score 80.8 | 11 |
| Language Detoxification | BAD (test) | Toxicity Reduction 37 | 10 |
| Open domain dialogue | BAD | RSR 53.7 | 9 |
| Red Teaming against BB-3B | BAD | RSR 66.4 | 9 |
| Language Detoxification | BAD (val) | Toxicity Proportion 11 | 7 |
| Red Teaming | BAD Against Friend Chat (test) | RSR 64.2 | 7 |
| Red Teaming | BAD Against Marv (test) | RSR 88.1 | 7 |