| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Safety and Utility evaluation suite (test) | Buffer-and-Reinforce | HS Score | 3 | 40 | 8d ago |
| FINVAULT | B5 LF-A | Approve Rate | 50.5 | 12 | 7d ago |
| BeaverTails & WildChat | Shadow-Level | Rule Adherence | 97.5 | 11 | 3mo ago |
| XSTest | ASTRA | Utility Score | 98.8 | 9 | 2mo ago |
| Safety and Utility Evaluation Suite | Qwen1.5-7B-Chat (Beam Search Pruning) | Unsafe Rate | 0.67 | 6 | 1mo ago |
| XSTest | | Safety Score | 9.89 | 5 | 23d ago |
| OR-Bench | SafeSearch w/o hf. | HarmR | 5.3 | 4 | 2mo ago |
| Mistral-7B 8-bit quantized | Goal | Unsafe Rate | 0 | 3 | 1mo ago |
| MaliciousGen & WildChat | Step-Level | Rule Adherence | 97.69 | 3 | 3mo ago |
| MaliciousGen & LMSYS-Chat | Shadow-Level | Rule Score | 97.31 | 3 | 3mo ago |
| BeaverTails & LMSYS-Chat | Shadow-Level | Rule Score | 97.88 | 3 | 3mo ago |