UNICORN: A Unified Backdoor Trigger Inversion Framework
About
The backdoor attack, in which an adversary uses inputs stamped with triggers (e.g., a patch) to activate pre-planted malicious behaviors, is a severe threat to Deep Neural Network (DNN) models. Trigger inversion is an effective way of identifying backdoored models and understanding the embedded adversarial behaviors. A challenge of trigger inversion is that there are many ways of constructing the trigger. Existing methods do not generalize to different types of triggers because they rely on certain assumptions or attack-specific constraints. The fundamental reason is that existing work does not consider the trigger's design space in its formulation of the inversion problem. This work formally defines and analyzes triggers injected in different spaces, as well as the inversion problem. It then proposes a unified framework to invert backdoor triggers, based on the formalization of triggers and on the inner behaviors of backdoor models identified in our analysis. Our prototype UNICORN is general and effective in inverting backdoor triggers in DNNs. The code can be found at https://github.com/RU-System-Software-and-Security/UNICORN.
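To make the notion of an input "stamped" with a patch trigger concrete, the following minimal sketch shows the pixel-space blending commonly used to describe patch triggers in the backdoor literature, x' = (1 - m) * x + m * t. It is an illustrative assumption, not UNICORN's formulation or code; the names `stamp_trigger`, `mask`, and `pattern`, and the toy 8x8 image, are hypothetical.

```python
import numpy as np


def stamp_trigger(x, mask, pattern):
    """Blend a trigger pattern into an input: x' = (1 - m) * x + m * t.

    mask is 1 where the trigger replaces the input and 0 elsewhere.
    """
    return (1.0 - mask) * x + mask * pattern


# Hypothetical 8x8 grayscale image with all pixels set to 0.5.
image = np.full((8, 8), 0.5)

# Hypothetical 2x2 patch trigger placed in the bottom-right corner.
mask = np.zeros((8, 8))
mask[-2:, -2:] = 1.0
pattern = np.ones((8, 8))  # white patch

stamped = stamp_trigger(image, mask, pattern)
```

In this pixel-space setting, trigger inversion amounts to searching for a `mask` and `pattern` that flip a model's predictions to the target label. Triggers injected in other spaces do not fit this form, which is the generalization gap the abstract describes.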
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Backdoor Detection | CIFAR-10 | Bd. Rate | 20 | 120 |
| Robotic Manipulation | Extracting Tissue 30 random repositioning trials (test) | Completion Rate | 73.33 | 16 |
| Robotic Manipulation | Lifting Cube 30 random repositioning trials (test) | CP | 0.8333 | 16 |
| Robotic Manipulation | Grasping Fanta 30 random repositioning trials (test) | CP Success Rate | 83.33 | 16 |
| Robotic Manipulation | Shaking Hand 30 random repositioning trials (test) | Completion Percentage | 76.67 | 16 |
| Backdoor Detection | CIFAR-10 | Clean Detection Rate | 0.52 | 10 |
| Backdoor Detection | CIFAR-10 all-to-one (test) | Clean Detection Rate | 52 | 10 |
| Backdoor Detection (Seven-to-one attack) | GTSRB | BadNet Acc | 98 | 9 |
| Backdoor Detection (All-to-one attack) | GTSRB | Clean Accuracy | 46 | 9 |
| Backdoor Recovery | VLA Robotic Manipulation Tasks (Fanta, Cube, Tissue, Hand) (test) | Backdoor Recovery Fanta | 6.67 | 7 |