
Backdoor Attacks Against Dataset Distillation

About

Dataset distillation has emerged as a prominent technique for improving data efficiency when training machine learning models. It encapsulates the knowledge of a large dataset into a smaller synthetic dataset; a model trained on this distilled dataset can attain performance comparable to a model trained on the original training dataset. However, existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility; the security risks stemming from them have not been explored. This study performs the first backdoor attack against models trained on data distilled by dataset distillation methods in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent them.
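The abstract's NAIVEATTACK poisons the raw data *before* distillation: a small trigger patch is stamped onto a fraction of the images, which are then relabeled to the attacker's target class. The sketch below is not the authors' code; it is a minimal NumPy illustration of this kind of trigger injection, with the patch location, size, and poisoning fraction chosen as assumptions for the example.

```python
import numpy as np

def add_trigger(images, labels, target_class=0, poison_frac=0.1,
                trigger_size=3, trigger_value=1.0, seed=0):
    """Stamp a square trigger into a random fraction of the images and
    relabel them to the target class (NAIVEATTACK-style: the poisoned
    data is what the distillation procedure later sees).

    images: float array of shape (N, H, W, C), values in [0, 1]
    labels: int array of shape (N,)
    Returns poisoned copies plus the indices of the poisoned samples.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_frac)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # Place the trigger patch in the bottom-right corner of each image.
    images[idx, -trigger_size:, -trigger_size:, :] = trigger_value
    labels[idx] = target_class
    return images, labels, idx
```

DOORPING, by contrast, would re-optimize the trigger at every distillation iteration rather than fixing it up front; that step depends on the specific distillation objective and is not shown here.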

Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang• 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Backdoor Attack | FMNIST | ASR | 100 | 75 |
| Backdoor Attack | CIFAR10 | ASR | 100 | 70 |
| Backdoor Attack in Dataset Condensation | CIFAR10 | CTA | 73.92 | 43 |
| Backdoor Attack in Dataset Condensation | SVHN | CTA | 88.58 | 43 |
| Backdoor Attack in Dataset Condensation | FMNIST | CTA | 88.4 | 43 |
| Backdoor Attack in Dataset Condensation | Tiny-ImageNet | CTA | 51.2 | 43 |
| Backdoor Attack Stealthiness Evaluation | CIFAR10 | SSIM | 0.18 | 40 |
| Backdoor Attack in Dataset Condensation | STL10 | CTA | 72.92 | 31 |
| Backdoor Attack | SVHN | ASR | 100 | 27 |
| Backdoor Attack | STL10 | ASR | 100 | 21 |

ASR = Attack Success Rate; CTA = Clean Trigger Accuracy. Showing 10 of 13 rows.
