
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm

About

Transformer-based architectures have dominated various areas of machine learning in recent years. In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. Crucially, this technique can be integrated into existing transformers as a plug-and-play layer, improving their robustness without the need for additional training or fine-tuning. Through comprehensive experiments and ablation studies, we demonstrate that our ProTransformer significantly enhances the robustness of transformer models across a variety of prediction tasks, attack mechanisms, backbone architectures, and data domains. Notably, without further fine-tuning, the ProTransformer consistently improves the performance of vanilla transformers by 19.5%, 28.3%, 16.1%, and 11.4% for BERT, ALBERT, DistilBERT, and RoBERTa, respectively, under the classical TextFooler attack. Furthermore, ProTransformer shows promising resilience in large language models (LLMs) against prompting-based attacks, improving the performance of T5 and LLaMA by 24.8% and 17.8%, respectively, and enhancing Vicuna by an average of 10.4% against the Jailbreaking attack. Beyond the language domain, ProTransformer also demonstrates outstanding robustness in both vision and graph domains.
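To illustrate the plug-and-play idea, the sketch below contrasts the standard attention output (a softmax-weighted mean of value vectors, i.e. the solution of a weighted least-squares problem) with a robust variant that swaps the squared loss for a Huber loss and solves it by iteratively reweighted least squares. This is an illustrative reconstruction, not the paper's exact estimator or solver; the function names, the Huber threshold `delta`, and the iteration count are assumptions for the demo.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def vanilla_attention_out(scores, V):
    # Standard attention output for one query: softmax-weighted mean of values.
    return softmax(scores) @ V

def robust_attention_out(scores, V, delta=1.0, iters=10):
    # Illustrative robust drop-in: view the weighted mean as a weighted
    # least-squares estimate, replace the squared loss with a Huber loss,
    # and solve by iteratively reweighted least squares (IRLS).
    # (Sketch only -- the paper's exact estimator/solver may differ.)
    a = softmax(scores)
    z = a @ V  # initialize at the vanilla attention output
    for _ in range(iters):
        r = np.linalg.norm(V - z, axis=1)  # residual of each value vector
        huber_w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
        w = a * huber_w                    # down-weight outlying tokens
        z = (w @ V) / w.sum()
    return z

# Toy demo: one adversarially corrupted value vector.
rng = np.random.default_rng(0)
V = rng.normal(size=(8, 4))
V[0] += 50.0                 # corrupt one token's value vector
scores = np.zeros(8)         # uniform attention for simplicity
vanilla = vanilla_attention_out(scores, V)
robust = robust_attention_out(scores, V)
```

Because the layer only changes how value vectors are aggregated, the same pretrained weights can be reused, which is what allows the mechanism to be slotted into an existing transformer without retraining. In the demo, the robust output stays close to the mean of the clean tokens while the vanilla output is dragged toward the outlier.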

Zhichao Hou, Weizhi Gao, Yuchen Shen, Feiyi Wang, Xiaorui Liu • 2024

Related benchmarks

Task | Dataset | Result | Rank
--- | --- | --- | ---
Image Classification | CIFAR-10 (test) | Accuracy: 98.4 | 3381
Node Classification | Citeseer (test) | Accuracy: 0.734 | 729
Node Classification | Cora-ML | Accuracy: 84.6 | 228
Text Classification | SST-2 (test) | Accuracy: 95 | 185
Adversarial Robustness | CIFAR-10 (test) | -- | 76
Jailbreak Attack | Behaviours | ASR: 0.9 | 69
Jailbreak Defense | Behaviours (test) | ASR: 0.9 | 44
Sentiment Analysis | IMDB (test) | Clean Accuracy (%): 93.6 | 37
Topic Classification | AGNews | Clean Acc: 94.2 | 16

Other info

Code
