| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Llama-Guard | IPO | Harmfulness (%) | 82.14 | 36 | 1mo ago |
| PKU-SafeRLHF 30K (test) | wDPO | Win Rate (WR) | 90.23 | 32 | 1mo ago |
| Refusal Evaluation Dataset | | Refusal Rate | 99 | 16 | 1mo ago |
| Safety Evaluation Dataset | DCR | Response Safe Rate (Llama Guard Model) | 81 | 5 | 1mo ago |