| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| XSTEST-RESP | GuardReasoner-Omni 4B | Response Harmfulness F1 | 95.48 | 34 | 4d ago |
| HarmBench | GuardReasoner-Omni 4B | F1 Score | 87.61 | 23 | 4d ago |
| BeaverTails | BeaverDam 7B | F1 Score | 89.9 | 18 | 4d ago |
| HarmTextVideo | GuardReasoner-Omni 4B | F1 Score | 95.25 | 5 | 4d ago |
| SPA-VL-Eval | GuardReasoner-Omni 2B | F1 Score | 74.73 | 5 | 4d ago |