Multimodal Content Moderation on UnsafeBench Sexual Text+Visual
[Chart: Accuracy by method. Highlighted point: KidsNanny, 81.08, Mar 17, 2026. Selectable metrics: Accuracy, F1 Score, Precision, Recall, Inference Latency (ms).]
Evaluation Results

Method         Configuration        Links    Accuracy  F1 Score  Precision  Recall  Inference Latency (ms)
KidsNanny      Regime=2, Stage=1+2  2026.03  81.08     85.11     80.46      90.32   115
LlavaGuard     Regime=2             2026.03  76.45     79.18     84.06      74.84   4,138
ShieldGemma-2  Regime=2             2026.03  54.86     51.67     72.94      40      1,136
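
The page does not say how its F1 Score column is computed. As a sanity check, the sketch below assumes the standard binary F1 (the harmonic mean of precision and recall) and recomputes it from the Precision and Recall values reported for KidsNanny, LlavaGuard, and ShieldGemma-2. The function name `f1_score` and the `reported` dictionary are illustrative and not part of the benchmark.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent).

    Assumption: the leaderboard uses the standard binary F1 definition.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the evaluation results
reported = {
    "KidsNanny": (80.46, 90.32, 85.11),
    "LlavaGuard": (84.06, 74.84, 79.18),
    "ShieldGemma-2": (72.94, 40.0, 51.67),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {f1:.2f}")
```

Under that assumption, the recomputed values agree with the reported F1 scores to two decimal places for all three methods, which suggests the F1, Precision, and Recall columns are internally consistent.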