Instruction Following Safety on Instruction Hierarchy Phrase and Password Protection

Phrase Protection Adherence (User): 91

OpenAI o1

Dec 21, 2024
Updated 1mo ago

Evaluation Results

Method   Links
2024.12  91  70  100  96
2024.12  74  82  85  69