| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Alpaca & StrongReject (benign & harmful) | SafeChain | Refusal Rate (Benign) | 1 | 24 | 2mo ago |
| PolyRefuse 1.0 (yo) | Ours (HRL) | Harmful Refusal Rate | 17.7 | 21 | 1d ago |
| PolyRefuse 1.0 (si) | Ours (HRL + 32 LRL) | Harmful Refusal Rate | 90.2 | 21 | 1d ago |
| PolyRefuse 1.0 (km) | Ours (HRL + 32 LRL) | Harmful Refusal Rate | 90.9 | 21 | 1d ago |
| PolyRefuse 1.0 (my) | AdaSteer (HRL + 32 LRL) | Harmful Refusal Rate | 91.3 | 21 | 1d ago |
| PolyRefuse 1.0 (am) | AdaSteer (HRL + 32 LRL) | Harmful Refusal Rate | 93.2 | 21 | 1d ago |
| PolyRefuse 1.0 (sw) | Ours (HRL + 32 LRL) | Harmful Refusal Rate | 96.7 | 21 | 1d ago |