Jailbreak evaluation on illicit non-violent crime prompts
Top result: gpt-5-thinking, Not Unsafe Rate 99.5 (Dec 19, 2025)
Evaluation Results
Method | Date | Not Unsafe Rate
gpt-5-thinking | 2025.12 | 99.5
OpenAI o3 | 2025.12 | 98.5
GPT-4o | 2025.12 | 93.7
gpt-5-main | 2025.12 | 93.4