Multimodal Safety Evaluation on multimodal safety dataset
[Chart: ASR. Highlighted entry: Claude 3.5 Sonnet, ASR 0.13, Apr 13, 2025. Selectable metrics: ASR, FPR, Accuracy, Precision, F1 Score.]
Evaluation Results
| Method | Date | ASR | FPR | Accuracy | Precision | F1 Score |
|---|---|---|---|---|---|---|
| Claude 3.5 Sonnet | 2025.04 | 0.13 | 0.6944 | 0.59 | 0.56 | 0.68 |
| GPT-4o Mini | 2025.04 | 0.25 | 0.6444 | 0.55 | 0.54 | 0.63 |
| Gemini 1.5 Pro | 2025.04 | 0.255 | 0.2333 | 0.76 | 0.76 | 0.75 |
| Claude 3 Haiku | 2025.04 | 0.265 | 0.4722 | 0.63 | 0.61 | 0.67 |
| GPT-4o | 2025.04 | 0.44 | 0.1278 | 0.72 | 0.81 | 0.66 |
| Gemini 1.5 Flash | 2025.04 | 0.635 | 0.2389 | 0.56 | 0.6 | 0.46 |
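
The results above are listed in ascending ASR order, but the ranking changes with the metric. The following is a minimal Python sketch for re-ranking the same six rows by any column. The names `Row`, `ROWS`, `LOWER_IS_BETTER`, and `rank_by` are hypothetical and not part of the leaderboard, and treating lower ASR and FPR as better is an assumption based on the listed order, not something the page states.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One leaderboard entry, with values copied from the results above."""
    method: str
    asr: float
    fpr: float
    accuracy: float
    precision: float
    f1: float


ROWS = [
    Row("Claude 3.5 Sonnet", 0.13, 0.6944, 0.59, 0.56, 0.68),
    Row("GPT-4o Mini", 0.25, 0.6444, 0.55, 0.54, 0.63),
    Row("Gemini 1.5 Pro", 0.255, 0.2333, 0.76, 0.76, 0.75),
    Row("Claude 3 Haiku", 0.265, 0.4722, 0.63, 0.61, 0.67),
    Row("GPT-4o", 0.44, 0.1278, 0.72, 0.81, 0.66),
    Row("Gemini 1.5 Flash", 0.635, 0.2389, 0.56, 0.6, 0.46),
]

# Assumption: lower is better for ASR and FPR, higher is better otherwise.
LOWER_IS_BETTER = {"asr", "fpr"}


def rank_by(metric: str) -> list[str]:
    """Return method names ordered from best to worst on the given metric."""
    if metric == "method" or metric not in Row._fields:
        raise ValueError(f"unknown metric: {metric!r}")
    # Sort descending only for metrics where a higher value is better.
    descending = metric not in LOWER_IS_BETTER
    ordered = sorted(ROWS, key=lambda row: getattr(row, metric), reverse=descending)
    return [row.method for row in ordered]


if __name__ == "__main__":
    for metric in ("asr", "fpr", "accuracy", "precision", "f1"):
        print(metric, "->", rank_by(metric)[0])
```

Under these assumptions, Claude 3.5 Sonnet ranks first on ASR but last on FPR (0.6944, the highest in the results), so a single-column ranking is worth reading alongside the other metrics.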