Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

About

Deep Neural Networks have recently gained lots of success after enabling several breakthroughs in notoriously challenging problems. Training these networks is computationally expensive and requires vast amounts of training data. Selling such pre-trained models can, therefore, be a lucrative business model. Unfortunately, once the models are sold they can be easily copied and redistributed. To avoid this, a tracking mechanism to identify models as the intellectual property of a particular vendor is necessary. In this work, we present an approach for watermarking Deep Neural Networks in a black-box way. Our scheme works for general classification tasks and can easily be combined with current learning algorithms. We show experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for and evaluate the robustness of our proposal against a multitude of practical attacks. Moreover, we provide a theoretical analysis, relating our approach to previous work on backdooring.
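
The abstract does not spell out how ownership is checked in the black-box setting. As a rough illustration only, the sketch below assumes the owner keeps a secret trigger set of inputs paired with owner-chosen labels and claims ownership when a suspect model, queried only through its predictions, reproduces those labels above a threshold. The function names, the 0.9 threshold, and the toy "models" are all hypothetical and are not taken from the paper.

```python
import random
from typing import Callable, List, Tuple

Input = Tuple[float, ...]
TriggerSet = List[Tuple[Input, int]]


def make_trigger_set(num_triggers: int, num_classes: int,
                     input_dim: int, seed: int) -> TriggerSet:
    """Build a toy trigger set: random inputs paired with random labels.

    Assumption: the labels are chosen by the owner and are unrelated to
    the inputs, so an unmarked model should match them only by chance.
    """
    rng = random.Random(seed)
    triggers = []
    for _ in range(num_triggers):
        x = tuple(rng.random() for _ in range(input_dim))
        y = rng.randrange(num_classes)
        triggers.append((x, y))
    return triggers


def trigger_accuracy(predict: Callable[[Input], int],
                     trigger_set: TriggerSet) -> float:
    """Fraction of triggers on which the model returns the owner's label.

    Only predictions are used, which is what makes the check black-box.
    """
    hits = sum(1 for x, y in trigger_set if predict(x) == y)
    return hits / len(trigger_set)


def verify_watermark(predict: Callable[[Input], int],
                     trigger_set: TriggerSet,
                     threshold: float = 0.9) -> bool:
    """Claim ownership if trigger accuracy reaches the (hypothetical) threshold."""
    return trigger_accuracy(predict, trigger_set) >= threshold


# Toy stand-ins for real networks (hypothetical, for illustration only).
trigger_set = make_trigger_set(num_triggers=100, num_classes=10,
                               input_dim=8, seed=0)

# A "watermarked" model that has memorized the trigger labels.
memorized = dict(trigger_set)
watermarked_predict = lambda x: memorized.get(x, 0)

# An "unmarked" model that never saw the trigger set; it always answers 0.
unmarked_predict = lambda x: 0

watermarked_ok = verify_watermark(watermarked_predict, trigger_set)
unmarked_ok = verify_watermark(unmarked_predict, trigger_set)
```

In this toy setup the memorizing model passes the check, while the constant model matches only the triggers that happen to carry label 0. A real deployment would replace the two lambdas with calls to an actual classifier.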

Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 | Accuracy 74.43 | 109 |
| Image Classification | GTSRB (test) | Accuracy (Clean) 88.88 | 59 |
| Model Extraction Attack | CIFAR10 | Acc 65.13 | 35 |
| Training Data Provenance Verification | CIFAR10 | Avg AUC 80.25 | 27 |
| Ownership Verification | Model Extraction Setting Surrogate Models | AUC 80.25 | 24 |
| Image Classification | CIFAR-10 | Accuracy 89.31 | 24 |
| Ownership Verification | MNIST MT | Accuracy (Watermark Patch) 24 | 14 |
| Watermark Detection | GTSRB | AccLoss 12.55 | 14 |
| Model Extraction Attack Robustness | GTSRB | Accuracy 16.55 | 14 |
| Watermark Detection | CIFAR10 | AccLoss 8.53 | 14 |
Showing 10 of 38 rows
