Safety Evaluation on DeepInception (test)

78.5 ASR

No Defense

Jan 23, 2026
Updated 1mo ago

Evaluation Results

Date      ASR
2026.01   78.5
2026.01   73
2026.01   50.8
2026.01   43.2
2026.01   5.6
2026.01   0.8
2026.01   0.8