
Defense against Harmful Fine-tuning on Backdoor Jailbreaking No Trigger

Harm Score: 1.6

Booster

May 7, 2026
Updated 26d ago

Evaluation Results

Date       Harm Score
2026.05    1.6
2026.05    1.8
2026.05    2.1
2026.05    2.3
2026.05    2.7
2026.05    3.2