| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| AegisSafety (test) | LLaMA Guard 3 | F1 Score | 99.5 | 41 | 2d ago |
| Text & Image Benchmarks Average | GuardReasoner-Omni 2B | F1 Score | 83.84 | 19 | 3mo ago |
| Combined Prompt Harmfulness Suite (ToxicChat, HarmBench, OpenAI Mod, Aegis SafetyTest, WildGuard Test) | GuardReasoner | Macro Avg Harmfulness Detection Rate | 84.4 | 17 | 5d ago |
| HarmVideo | GuardReasoner-Omni 4B | F1 Score | 95.5 | 7 | 3mo ago |
| FVC | GuardReasoner-Omni 4B | F1 Score | 67.86 | 7 | 3mo ago |
| XD-Violence | GuardReasoner-Omni 2B | F1 Score | 96.82 | 7 | 3mo ago |
| UCF-crime | GuardReasoner-Omni 4B | F1 Score | 91.67 | 7 | 3mo ago |
| Video Benchmarks Average | GuardReasoner-Omni 4B | F1 Score | 94.84 | 5 | 3mo ago |
| HarmTextVideo | GuardReasoner-Omni 4B | F1 Score | 99.47 | 5 | 3mo ago |