Jailbreak Defense on Decoding MaliciousInstruct

[Chart: ASR. Labeled method: JPU (Jan 6, 2026). Axis ticks: 0.28, 5.14, 10, 14.86]
Updated 4d ago

Evaluation Results

Method | Links
[Timeline: entries dated 2026.01, days 1, 4, 6, 7, 7, 7, 7, 8, 8, 8, 9, 10, 17, 19]