Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

About

Deep Neural Networks have recently gained lots of success after enabling several breakthroughs in notoriously challenging problems. Training these networks is computationally expensive and requires vast amounts of training data. Selling such pre-trained models can, therefore, be a lucrative business model. Unfortunately, once the models are sold they can be easily copied and redistributed. To avoid this, a tracking mechanism to identify models as the intellectual property of a particular vendor is necessary. In this work, we present an approach for watermarking Deep Neural Networks in a black-box way. Our scheme works for general classification tasks and can easily be combined with current learning algorithms. We show experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for and evaluate the robustness of our proposal against a multitude of practical attacks. Moreover, we provide a theoretical analysis, relating our approach to previous work on backdooring.

Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet • 2018
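
The abstract describes black-box watermarking through backdooring only at a high level. As a rough illustration, the following minimal sketch assumes the owner holds a secret "trigger set" of inputs with pre-assigned labels, and that ownership is checked purely by querying a suspect model's predictions on that set. The names make_trigger_set and verify_watermark, the random-vector triggers, and the 0.9 threshold are hypothetical choices made for this example, not details taken from the paper.

```python
import random
from typing import Callable, List, Sequence, Tuple

Input = Tuple[float, ...]


def make_trigger_set(
    num_triggers: int, num_classes: int, input_dim: int, seed: int = 0
) -> Tuple[List[Input], List[int]]:
    """Build a hypothetical secret trigger set: random inputs with random labels.

    Assumption for this sketch: triggers are random vectors. A real scheme
    would choose inputs suited to the model's input domain.
    """
    rng = random.Random(seed)
    inputs = [
        tuple(rng.random() for _ in range(input_dim)) for _ in range(num_triggers)
    ]
    labels = [rng.randrange(num_classes) for _ in range(num_triggers)]
    return inputs, labels


def verify_watermark(
    predict: Callable[[Input], int],
    trigger_inputs: Sequence[Input],
    trigger_labels: Sequence[int],
    threshold: float = 0.9,  # hypothetical acceptance threshold
) -> Tuple[bool, float]:
    """Black-box check: query the model on the triggers and compare labels.

    Only predictions are used, never weights. Returns (accepted, match_rate).
    """
    if not trigger_labels or len(trigger_inputs) != len(trigger_labels):
        raise ValueError("trigger inputs and labels must be non-empty and aligned")
    matches = sum(
        1 for x, y in zip(trigger_inputs, trigger_labels) if predict(x) == y
    )
    match_rate = matches / len(trigger_labels)
    return match_rate >= threshold, match_rate


if __name__ == "__main__":
    inputs, labels = make_trigger_set(num_triggers=50, num_classes=10, input_dim=4)

    # Stand-in for a watermarked model: it has memorized the trigger labels.
    memorized = dict(zip(inputs, labels))
    print(verify_watermark(lambda x: memorized[x], inputs, labels))

    # Stand-in for an unrelated model: it always predicts class 0.
    print(verify_watermark(lambda x: 0, inputs, labels))
```

In this toy setup, a model that was never trained on the trigger set should match the random labels only at roughly chance level, which is why a high match rate can serve as evidence of ownership. How the threshold is set and how the trigger set is kept secret are left out of this sketch.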

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Image Classification	CIFAR-100	Accuracy	74.43	435
Image Classification	GTSRB (test)	Accuracy (Clean)	88.88	94
Model Extraction Attack	CIFAR10	Acc	65.13	35
Training Data Provenance Verification	CIFAR10	Avg AUC	80.25	27
Ownership Verification	Model Extraction Setting Surrogate Models	AUC	80.25	24
Image Classification	CIFAR-10	Accuracy	89.31	24
Ownership Verification	MNIST MT	Accuracy (Watermark Patch)	24	14
Watermark Detection	GTSRB	AccLoss	12.55	14
Model Extraction Attack Robustness	GTSRB	Accuracy	16.55	14
Watermark Detection	CIFAR10	AccLoss	8.53	14

Showing 10 of 38 rows
