System Prompt Extraction on Instruction Hierarchy (test)

Attack Success Rate (Realistic User): 99.7

OpenAI o3

Updated 5mo ago

Evaluation Results

Method	Links	Attack Success Rate (Realistic User)	(unlabeled)	(unlabeled)
OpenAI o3 2025.12		99.7	98.2	98.2
gpt-5-thinking 2025.12		99.0	99.1	99.1
gpt-5-main 2025.12		88.5	93.0	78.9
GPT-4o 2025.12		88.5	82.5	56.1