
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

About

Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective, and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) a Robust Feature regularizer, which increases the mutual information between local robust features and global features. We provide a principled way to theoretically analyze and improve the robustness of representation learning for language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy on several adversarial datasets for Natural Language Inference (NLI) and Question Answering (QA) tasks. Our code is available at https://github.com/AI-secure/InfoBERT.
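Both regularizers rely on estimating mutual information between high-dimensional representations, which in practice is done with variational bounds. As an illustration only (this is not the paper's implementation), the sketch below computes the InfoNCE lower bound on I(X; Y), a standard estimator of this kind; the function name, dot-product critic, and toy data are our own assumptions.

```python
import numpy as np

def infonce_lower_bound(x, y):
    """InfoNCE lower bound on I(X; Y) for K paired samples (x_i, y_i).

    Uses a dot-product critic: row i of `x` is the positive pair of
    row i of `y`; all other rows serve as negatives.
    """
    scores = x @ y.T                                   # (K, K) critic scores
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    k = x.shape[0]
    # Bound = log K + E[log p(positive | candidates)]
    return np.log(k) + np.mean(np.diag(log_softmax))

rng = np.random.default_rng(0)
x = rng.normal(size=(128, 16))
bound_paired = infonce_lower_bound(x, x)                        # fully dependent
bound_random = infonce_lower_bound(x, rng.normal(size=(128, 16)))  # independent
```

For perfectly dependent pairs the estimate approaches its ceiling of log K, while for independent samples it drops far below; a regularizer can maximize or minimize such a bound depending on whether the mutual information should be increased (robust local/global features) or suppressed (noisy input/feature information).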

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu • 2020

Related benchmarks

Task | Dataset | Metric | Result | Rank
Natural Language Inference | SNLI | Accuracy | 93.3 | 174
Text Classification | AGNews | Clean Accuracy | 94.81 | 118
Natural Language Inference | MNLI | Accuracy (matched) | 90.7 | 80
Text Classification | IMDB (test) | Clean Accuracy | 92 | 79
Sentiment Analysis | SST-2 (test) | Clean Accuracy | 92.9 | 50
Sentiment Analysis | IMDB (test) | Clean Accuracy (%) | 94.18 | 37
Text Classification | IMDB | Clean Accuracy | 95.2 | 32
Natural Language Inference | ANLI (test) | Overall Score | 58.3 | 28
Natural Language Inference | QNLI (test) | -- | -- | 27
Text Classification | AGNews (test) | Accuracy (Clean) | 95.5 | 15

Showing 10 of 29 rows
