
LlavaGuard

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Policy Responsiveness | LlavaGuard v1 (test) | PER: 98.75 | 20 |
| Unsafe content detection | LlavaGuard | Accuracy: 82 | 14 |
| Visual Compliance Verification | LlavaGuard 1290 samples (test) | Unsafe F1 Score: 93 | 13 |
| Prompt Classification | LlavaGuard Image Prompt | F1 Score: 0.752 | 7 |
| Out-of-Taxonomy Risk Detection | LlavaGuard | F1 Score: 66.87 | 4 |
| OOD safety category inference (Stage 2) | LlavaGuard | Reward Mean: 13.28 | 4 |