
System Prompt Extraction on Instruction Hierarchy (test)

Attack Success Rate (Realistic User): 99.7

OpenAI o3

Dec 19, 2025
Updated 3mo ago

Evaluation Results

Method    Links
2025.12
99.7    98.2    98.2
99    99.1    99.1
2025.12
88.5    93    78.9
2025.12
88.5    82.5    56.1