
Jailbreak evaluation on sexual-content prompts

Non-Unsafe Rate: 99.5

gpt-5-thinking

Updated 4mo ago

Evaluation Results

Method	Links	Non-Unsafe Rate
gpt-5-thinking 2025.12		99.5
OpenAI o3 2025.12		99.1
gpt-5-main 2025.12		96.7
GPT-4o 2025.12		96.1