Multimodal Safety Evaluation on ChatGPT image input safety evaluations
[Chart: Hate Safety score by model; highlighted point: GPT-4o, 98.9, Dec 19, 2025. Selectable metrics: Hate Safety, Extremism Safety, Illicit Content Safety, Attack Planning Safety, Self-Harm Safety, Erotic Content Safety.]
Evaluation Results
Method          Date     Hate Safety  Extremism Safety  Illicit Content Safety  Attack Planning Safety  Self-Harm Safety  Erotic Content Safety
GPT-4o          2025.12  98.9         96.4              94.6                    95.6                    98                99.5
gpt-5-main      2025.12  98.6         99.1              98.6                    100                     99.7              99.4
gpt-5-thinking  2025.12  96.8         98                98.8                    100                     99.6              99.4
OpenAI o3       2025.12  93.5         96.2              97.2                    98                      98.2              98.7