
Instruction Hierarchy Robustness on Tutor Jailbreak user (dev)

Non-violation Rate: 99

GPT-5-Mini-R

[Chart: Non-violation Rate; y-axis ticks 96.92, 97.46, 98, 98.54; date Mar 11, 2026]
Updated 1mo ago

Evaluation Results

Method, Links
99
2026.03
97