Multimodal Safety Evaluation on VLGuard (test)
[Chart: Accuracy over time. Best result: 86.78 (LLaVAShield-7B), Sep 30, 2025. Selectable metrics: Accuracy, Recall, Precision, F1 Score.]
Evaluation Results
| Method | Configuration | Date | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|---|---|
| LLaVAShield-7B | Model size=7B | 2025.09 | 86.78 | 98.7 | 83.64 | 90.55 |
| GPT-5-mini | Variant=mini | 2025.09 | 84.47 | 76.8 | 98.71 | 86.39 |
| Gemini-2.5-Pro | Variant=Pro | 2025.09 | 78.69 | 71.2 | 94.18 | 81.09 |
| Llama-Guard-4-12B | Model size=12B | 2025.09 | 66.56 | 48.1 | 99.59 | 64.87 |
| Qwen2.5-VL-7B-Instruct | Model size=7B, Variant... | 2025.09 | 50.77 | 23.3 | 100 | 37.79 |
| InternVL3-8B | Model size=8B | 2025.09 | 43.26 | 11.6 | 100 | 20.79 |
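
The F1 scores listed above appear to be the harmonic mean of the precision and recall values, F1 = 2PR / (P + R). The following is a minimal sketch for checking that, assuming this standard definition (the page itself does not state how F1 was computed). The names `ROWS`, `f1_score`, and `check_rows` are illustrative and not part of any benchmark tooling.

```python
# Rows copied from the results above:
# (method, recall, precision, reported F1), all values as listed on the page.
ROWS = [
    ("LLaVAShield-7B", 98.7, 83.64, 90.55),
    ("GPT-5-mini", 76.8, 98.71, 86.39),
    ("Gemini-2.5-Pro", 71.2, 94.18, 81.09),
    ("Llama-Guard-4-12B", 48.1, 99.59, 64.87),
    ("Qwen2.5-VL-7B-Instruct", 23.3, 100.0, 37.79),
    ("InternVL3-8B", 11.6, 100.0, 20.79),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_rows(rows, tol=0.01):
    """Return rows whose recomputed F1 differs from the reported F1 by more than tol."""
    mismatches = []
    for method, recall, precision, reported in rows:
        recomputed = round(f1_score(precision, recall), 2)
        if abs(recomputed - reported) > tol:
            mismatches.append((method, recomputed, reported))
    return mismatches


if __name__ == "__main__":
    for method, recall, precision, reported in ROWS:
        recomputed = f1_score(precision, recall)
        print(f"{method}: recomputed F1 = {recomputed:.2f}, reported F1 = {reported}")
    print("mismatches:", check_rows(ROWS))
```

Under this assumption all six rows agree with the reported F1 to two decimal places. The check also makes the ranking easier to read: the two models with precision 100 rank lowest on F1 because their recall is 23.3 and 11.6, while LLaVAShield-7B gives up some precision (83.64) for a recall of 98.7.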