How to Prove Your Model Belongs to You: A Blind-Watermark based Framework to Protect Intellectual Property of DNN

About

Deep learning techniques have made tremendous progress over the past decade on a variety of challenging tasks, such as image recognition and machine translation. Training deep neural networks is computationally expensive and requires both human and intellectual resources. It is therefore necessary to protect the intellectual property of a model and to verify its ownership externally. However, previous studies either fail to defend against the evasion attack or do not explicitly deal with fraudulent claims of ownership by adversaries. Furthermore, they cannot establish a clear association between the model and the creator's identity. To fill these gaps, in this paper, we propose a novel intellectual property protection (IPP) framework, based on blind watermarking, for watermarking deep neural networks in a way that meets the requirements of security and feasibility. Our framework takes ordinary samples and the owner's exclusive logo as inputs and outputs newly generated samples as watermarks, which are almost indistinguishable from the originals. It then infuses these watermarks into DNN models by assigning them specific labels, leaving a backdoor that serves as the basis for our copyright claim. We evaluated our IPP framework on two benchmark datasets and 15 popular deep learning models. The results show that our framework successfully verifies the ownership of all the models without a noticeable impact on their primary task. Most importantly, we are the first to successfully design and implement a blind-watermark-based framework, which achieves state-of-the-art performance on undetectability against the evasion attack and unforgeability against fraudulent claims of ownership. Further, our framework shows remarkable robustness and establishes a clear association between the model and the author's identity.

Zheng Li, Chengyu Hu, Yang Zhang, Shanqing Guo • 2019
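
The workflow described in the abstract (ordinary samples plus a logo go in, near-identical watermarked samples come out, and those samples are tied to a specific label) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's watermarks are produced by the framework itself, while the sketch below swaps in plain alpha blending as a stand-in. The function names, the `alpha` blending weight, and the `threshold` used for verification are all hypothetical choices made for illustration.

```python
import numpy as np


def make_watermarks(samples, logo, alpha=0.02):
    """Blend a logo faintly into each sample.

    Assumption: simple alpha blending stands in for the paper's
    watermark generator. samples has shape (N, H, W) and logo has
    shape (H, W); both hold floats in [0, 1].
    """
    blended = (1.0 - alpha) * samples + alpha * logo
    return np.clip(blended, 0.0, 1.0)


def build_trigger_set(samples, logo, target_label, alpha=0.02):
    """Pair every watermarked sample with the owner's chosen label."""
    watermarks = make_watermarks(samples, logo, alpha)
    labels = np.full(len(watermarks), target_label, dtype=np.int64)
    return watermarks, labels


def verify_ownership(predict, watermarks, target_label, threshold=0.9):
    """Check how often a suspect model maps watermarks to the target label.

    predict is any callable returning one class index per input.
    threshold is a hypothetical decision rule, not a value from the paper.
    """
    preds = np.asarray(predict(watermarks))
    hit_rate = float(np.mean(preds == target_label))
    return hit_rate >= threshold, hit_rate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.random((8, 32, 32))   # placeholder "ordinary samples"
    logo = np.zeros((32, 32))
    logo[12:20, 12:20] = 1.0            # placeholder square "logo"

    watermarks, labels = build_trigger_set(samples, logo, target_label=3)
    # Per-pixel change is bounded by alpha, so the watermarks stay
    # visually close to the original samples.
    print("max pixel change:", np.abs(watermarks - samples).max())
```

In a full pipeline, the `(watermarks, labels)` pairs would be mixed into the training data so that the model learns the backdoor, and `verify_ownership` would later be run against a suspect model's predictions.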

Related benchmarks

Task	Dataset	Metric	Result	
Image Classification	GTSRB (test)	Accuracy (Clean)	76.63	94
Model Extraction Attack	CIFAR10	Acc	62.09	35
Watermark Detection	GTSRB	AccLoss	12.42	14
Watermark Detection	CIFAR10	AccLoss	8.3	14
Watermark Detection	VGGFace	AccLoss	7.13	14
Model Extraction Attack Robustness	GTSRB	Accuracy	14.5	14
Model Extraction Attack Robustness	VGGFace	Acc	16.6	14
Watermark Detection	CIFAR100	AccLoss	5.6	14
Face Recognition	VGG-Face (test)	Accuracy	51.59	10
Object Classification	CIFAR100 (test)	Accuracy	64.51	8

Showing 10 of 19 rows
