| Dataset Name | SOTA Method | Metric | Trend | ||
|---|---|---|---|---|---|
| StrongREJECT | PPO SafeSearch | Harm Rate 0.2 | 29 | 26d ago | |
| WildTeaming | PPO SafeSearch w/o qr. & hf. | Harm Rate 0.3 | 15 | 26d ago | |
| RRB | GRPO SafeSearch | HarmR 1 | 15 | 26d ago | |
| XSTest | | XSTest Score 74.8 | 11 | 1mo ago | |
| Phone-Harm Harm-150 | GPT-5 | HR 0.4 | 8 | 6d ago | |