Robust Lottery Tickets for Pre-trained Language Models

About

Recent work on the Lottery Ticket Hypothesis has shown that pre-trained language models (PLMs) contain smaller matching subnetworks (winning tickets) that can reach accuracy comparable to the original models. However, these tickets have been shown to be not robust to adversarial examples, and even less robust than their PLM counterparts. To address this problem, we propose a novel method based on learning binary weight masks to identify robust tickets hidden in the original PLMs. Since the loss is not differentiable with respect to the binary mask, we assign the hard concrete distribution to the masks and encourage their sparsity using a smoothing approximation of L0 regularization. Furthermore, we design an adversarial loss objective to guide the search for robust tickets and ensure that the tickets perform well in both accuracy and robustness. Experimental results show that the proposed method significantly improves over previous work on adversarial robustness evaluation.

Rui Zheng, Rong Bao, Yuhao Zhou, Di Liang, Sirui Wang, Wei Wu, Tao Gui, Qi Zhang, Xuanjing Huang • 2022
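
The mask relaxation mentioned in the abstract can be illustrated with a minimal sketch of the hard concrete distribution in its commonly published form (stretch, then clip a noisy sigmoid). This is not the authors' code: the function names, the stretch limits GAMMA and ZETA, and the temperature BETA are assumptions chosen for illustration, and the adversarial loss term is not shown.

```python
import math
import random
from typing import Sequence

# Assumed hyperparameters (typical defaults for hard concrete gates);
# the values used in the paper are not stated on this page.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(log_alpha: float, rng: random.Random) -> float:
    """Draw one relaxed mask value in [0, 1] for a weight with logit log_alpha."""
    u = rng.uniform(1e-6, 1.0 - 1e-6)  # bounded away from 0 and 1 so the logs stay finite
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    stretched = s * (ZETA - GAMMA) + GAMMA  # stretch (0, 1) to (GAMMA, ZETA)
    return min(1.0, max(0.0, stretched))    # clip so exact 0 and 1 have non-zero probability


def expected_l0(log_alphas: Sequence[float]) -> float:
    """Smooth surrogate for the L0 norm: sum of P(mask != 0) over all weights."""
    shift = BETA * math.log(-GAMMA / ZETA)
    return sum(sigmoid(a - shift) for a in log_alphas)


def deterministic_mask(log_alpha: float) -> float:
    """Noise-free mask for evaluation: the same stretch-and-clip applied to sigmoid(log_alpha)."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))
```

In a training loop, a penalty proportional to expected_l0 would be added to the task loss so that gradient descent on the log_alpha values pushes more mask entries toward exactly zero; weights whose deterministic mask is zero are the ones pruned from the ticket.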

Related benchmarks

Task	Dataset	Result
Text Classification	AGNews	Clean Accuracy 94.9	118
Text Classification	IMDB	Clean Accuracy 93.8	32
Natural Language Inference	QNLI (test)	--	27
Text Classification	SST-2	Clean Accuracy 90.9	6
Semantic Textual Similarity	QQP (test)	Clean Accuracy 0.915	4
Natural Language Inference	MNLI (test)	Clean Accuracy 84	4
