
NEO-KD: Knowledge-Distillation-Based Adversarial Training for Robust Multi-Exit Neural Networks

About

While multi-exit neural networks are regarded as a promising solution for making efficient inference via early exits, combating adversarial attacks remains a challenging problem. In multi-exit networks, due to the high dependency among different submodels, an adversarial example targeting a specific exit not only degrades the performance of the target exit but also reduces the performance of all other exits concurrently. This makes multi-exit networks highly vulnerable to simple adversarial attacks. In this paper, we propose NEO-KD, a knowledge-distillation-based adversarial training strategy that tackles this fundamental challenge based on two key contributions. NEO-KD first uses neighbor knowledge distillation to guide the outputs on adversarial examples toward the ensemble outputs of neighbor exits on clean data. NEO-KD also employs exit-wise orthogonal knowledge distillation to reduce adversarial transferability across different submodels. The result is significantly improved robustness against adversarial attacks. Experimental results on various datasets and models show that our method achieves the best adversarial accuracy with reduced computation budgets, compared to the baselines relying on existing adversarial training or knowledge distillation techniques for multi-exit networks.
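
The neighbor knowledge distillation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which exits count as neighbors or which distance is used, so the adjacent-exit window (`radius=1`), the squared-error distance, and all function names below are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def neighbor_ensemble(clean_probs, i, radius=1):
    """Average the clean-data predictions of exit i and its adjacent exits.

    Assumption: "neighbors" means exits within `radius` positions of exit i,
    clipped at the first and last exit.
    """
    lo = max(0, i - radius)
    hi = min(len(clean_probs), i + radius + 1)
    window = clean_probs[lo:hi]
    n_classes = len(window[0])
    return [sum(p[c] for p in window) / len(window) for c in range(n_classes)]


def neighbor_kd_loss(clean_logits, adv_logits, radius=1):
    """Mean over exits of the squared error between each exit's adversarial
    prediction and the neighbor ensemble of clean predictions.

    Assumption: squared error is used as the distance; the abstract does not
    specify one.
    """
    clean_probs = [softmax(z) for z in clean_logits]
    adv_probs = [softmax(z) for z in adv_logits]
    total = 0.0
    for i, adv_p in enumerate(adv_probs):
        target = neighbor_ensemble(clean_probs, i, radius)
        total += sum((t - a) ** 2 for t, a in zip(target, adv_p))
    return total / len(adv_probs)


# Hypothetical logits for a 3-exit network on one 3-class input.
clean = [[2.0, 0.5, 0.1], [2.5, 0.3, 0.2], [3.0, 0.1, 0.1]]
adv = [[0.4, 1.8, 0.2], [0.9, 1.5, 0.1], [1.6, 1.0, 0.2]]
loss = neighbor_kd_loss(clean, adv)
```

In a training loop, a term like this would be added to the usual adversarial training loss so that each exit's adversarial prediction is pulled toward a smoothed clean target rather than toward its own clean output alone.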

Seokil Ham, Jungwuk Park, Dong-Jun Han, Jaekyun Moon • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | -- | 3518 |
| Image Classification | CIFAR-100 | -- | 691 |
| Image Classification | TinyImageNet (test) | -- | 499 |
| Image Classification | ImageNet (test) | Top-1 Acc: 34.3 | 235 |
| Image Classification | CIFAR-100 (test) | -- | 61 |
| Robust Classification | ImageNet standard (test) | Top-1 Acc: 35.63 | 48 |
| Robust Classification | Tiny ImageNet (test) | Top-1 Accuracy: 31.58 | 30 |
| Image Classification | MNIST (test) | Adversarial Accuracy: 97.42 | 20 |
| Adversarial Classification | MNIST (test) | Adversarial Accuracy: 96.62 | 20 |
| Image Classification | CIFAR-10 (test) | Accuracy (Exit 1): 46.53 | 5 |
Showing 10 of 12 rows
