| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Macro Metrics Aggregate across LLMs | SafeCtrl-RL | Macro-P Safeguarded Safety-Quality Score | 81.8 | 17 | 8d ago |
| Evil-Alpaca 3B L3.2 | | Safety-Quality Score (P_safeguarded) | 94.5 | 17 | 8d ago |
| DeepSeek-R1-Distill-Qwen-1.5B | SafeCtrl-RL | P_safeguarded (Safety-Quality Score) | 89.8 | 17 | 8d ago |
| DialoGPT large | SafeCtrl-RL | Safety-Quality Score | 0.647 | 17 | 8d ago |
| BlackSheep Llama3.2-3B | raw_history | Safety-Quality Score (P_safeguarded) | 93.5 | 17 | 8d ago |
| WMDP and GenHarm (test) | CAST | Refusal Rate (Chem) | 2.2 | 10 | 8d ago |