Multimodal Content Moderation on UnsafeBench Sexual Text-Only
[Chart: Accuracy by date. Plotted point: KidsNanny, 81.82 accuracy (Mar 17, 2026). Selectable metrics: Accuracy, F1 Score, Precision, Recall, Inference Time (ms).]
Evaluation Results
| Method | Configuration | Date | Accuracy | F1 Score | Precision | Recall | Inference Time (ms) |
|---|---|---|---|---|---|---|---|
| KidsNanny | Regime=2, Stage=1+2 | 2026.03 | 81.82 | 86.21 | 75.76 | 100 | 122 |
| ShieldGemma-2 | Regime=2 | 2026.03 | 59.09 | 70 | 60 | 84 | 1,136 |
| LlavaGuard | Regime=2 | 2026.03 | 59.09 | 60.87 | 66.67 | 56 | 4,138 |
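The F1 scores in the results above can be cross-checked against the listed precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch, not the benchmark's own evaluation code: the function name `f1_score` and the `results` dictionary are illustrative, and the numbers are copied from the results as reported. The latency ratio is a simple division by the fastest listed inference time.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1, inference time in ms), copied from the results.
# The dictionary layout is an assumption made for this illustration.
results = {
    "KidsNanny": (75.76, 100.0, 86.21, 122),
    "ShieldGemma-2": (60.0, 84.0, 70.0, 1136),
    "LlavaGuard": (66.67, 56.0, 60.87, 4138),
}

# Fastest listed inference time, used as the baseline for the latency ratio.
fastest_ms = min(row[3] for row in results.values())

for name, (precision, recall, reported_f1, ms) in results.items():
    computed_f1 = f1_score(precision, recall)
    print(
        f"{name}: computed F1={computed_f1:.2f}, "
        f"reported F1={reported_f1:.2f}, "
        f"latency ratio={ms / fastest_ms:.1f}x"
    )
```

Under these inputs, the computed F1 values agree with the reported ones to two decimal places, which suggests the table's precision, recall, and F1 columns are internally consistent.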