
Unsafe Instruction Mitigation on Libero Harm (test)

7.8 ASR

SAFE-Dict

Updated 5mo ago

Evaluation Results

Method	Date	ASR
SAFE-Dict	2026.02	7.8
Prompt-based Safety	2026.02	41.2
Default (no defense)	2026.02	84.7