| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| BeaverTails bottom 30% uncertainty slice (test) | Geometry-Lite | AUROC 85.1 | 70 | 13d ago | |
| BeaverTails, ToxicChat, PKU-SafeRLHF Hard (test) | MultiLayer-Linear | AUROC 90.2 | 63 | 13d ago | |
| Full in-distribution (test) | MultiLayer-Linear | AUROC 0.964 | 63 | 13d ago | |
| WildGuard | Qwen3Guard | Macro F1 Score 90.6 | 47 | 2d ago | |
| OpenAI Moderation | SIREN | Macro F1 Score 92.9 | 45 | 2d ago | |
| Aegis | WildGuard | Macro F1 89.78 | 25 | 1d ago | |
| Aegis 2.0 | SIREN | Macro F1 83.4 | 8 | 1mo ago | |