Unsafe Instruction Mitigation on Libero Harm (test)
SAFE-Dict: ASR 7.8 (Feb 2, 2026)
Evaluation Results
| Method | Setting | Date | ASR |
| --- | --- | --- | --- |
| SAFE-Dict | SAFE-Dict | 2026.02 | 7.8 |
| Prompt-based Safety | Prompt-based S... | 2026.02 | 41.2 |
| Default (no defense) | no defense | 2026.02 | 84.7 |
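
The ASR values listed above can be compared against the no-defense baseline with a little arithmetic. The sketch below is only illustrative: it assumes ASR is reported as a percentage where lower is better (the page does not define the metric), and the names `asr_by_method` and `reduction_vs_baseline` are hypothetical helpers, not part of any benchmark tooling.

```python
# ASR values copied from the results listed above.
# Assumption: ASR is a percentage and lower is better.
asr_by_method = {
    "SAFE-Dict": 7.8,
    "Prompt-based Safety": 41.2,
    "Default (no defense)": 84.7,
}


def reduction_vs_baseline(asr, baseline):
    """Return (absolute drop in ASR points, drop as a fraction of the baseline)."""
    absolute = baseline - asr
    relative = absolute / baseline
    return absolute, relative


baseline = asr_by_method["Default (no defense)"]

# Map each defended method to its reduction relative to the no-defense row.
reductions = {
    name: reduction_vs_baseline(asr, baseline)
    for name, asr in asr_by_method.items()
    if name != "Default (no defense)"
}

for name, (absolute, relative) in reductions.items():
    print(f"{name}: -{absolute:.1f} points ({relative:.1%} relative reduction)")
```

Reporting both the absolute and the relative drop is a design choice: the absolute figure shows how many points of ASR were removed, while the relative figure shows what share of the undefended ASR that represents.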