Response Harmfulness Detection on SPA-VL-Eval
Top result: GuardReasoner-Omni 2B, F1 Score 74.73 (Feb 3, 2026)
Evaluation Results
Method                   Model Category   Date      F1 Score
GuardReasoner-Omni 2B    VLM Gua...       2026.02   74.73
GuardReasoner-VL 7B      VLM Gua...       2026.02   72.62
GuardReasoner-Omni 4B    VLM Gua...       2026.02   72.13
GuardReasoner-VL 3B      VLM Gua...       2026.02   72.01
LLaMA Guard 4 12B        VLM Gua...       2026.02   52.57