Multimodal Safety Evaluation on Image input safety evaluation set
[Chart: Hate Safety Acc by date. Highlighted point: gpt-5-thinking-nano, 98.6, Dec 19, 2025. Updated 4d ago.]
Evaluation Results
| Method | Date | Hate Safety Acc | Extremism Safety Acc | Illicit Safety Acc | Attack Planning Safety Acc | Self-Harm Safety Acc | Erotic Safety Acc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| gpt-5-thinking-nano | 2025.12 | 98.6 | 97.3 | 98.6 | 98.6 | 93.9 | 96.3 |
| gpt-5-main-mini | 2025.12 | 98.4 | 98.4 | 98.2 | 99.5 | 99.4 | 99.8 |
| gpt-5-thinking-mini | 2025.12 | 97.1 | 98.2 | 98.6 | 98.6 | 98.7 | 99.2 |
| OpenAI o4-mini | 2025.12 | 92.7 | 95 | 95.6 | 93.9 | 92.7 | 97.8 |