
SFT-based Poisoning

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Backdoor Mitigation | SFT-based Poisoning Word trigger | Clean Accuracy (CACC): 96.7 | 18 |
| Backdoor Mitigation | SFT-based Poisoning Phrase trigger | Clean Accuracy (CACC): 95.7 | 18 |
| Backdoor Mitigation | SFT-based Poisoning Long trigger | Clean Accuracy (CACC): 94.8 | 18 |