| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| Harmful prompt suite (test) | | Refusal Rate: 95.5 | 15 | 4d ago | |
| CCP Sensitive | Qwen3-Next-80B-A3B-Thinking | Reject Rate: 92.35 | 13 | 4d ago | |
| HarmBench | LLaMA-3.1 8B | Baseline Performance: 71 | 1 | 4d ago | |
| XSTest | LLaMA-3.1 8B | Refusal Rate (Before): 61.27 | 1 | 4d ago | |