Jailbreak evaluation on violence prompts
[Chart: Non-Unsafe Rate, Dec 19, 2025. Top result: gpt-5-thinking, 99.9]
Updated 4d ago
Evaluation Results
Method           Date      Non-Unsafe Rate
gpt-5-thinking   2025.12   99.9
OpenAI o3        2025.12   99.2
GPT-4o           2025.12   95.5
gpt-5-main       2025.12   94.8