MPNet: Masked and Permuted Pre-training for Language Understanding

About

BERT adopts masked language modeling (MLM) for pre-training and is one of the most successful pre-training models. Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem. However, XLNet does not leverage the full position information of a sentence and thus suffers from position discrepancy between pre-training and fine-tuning. In this paper, we propose MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet and avoids their limitations. MPNet leverages the dependency among predicted tokens through permuted language modeling (vs. MLM in BERT), and takes auxiliary position information as input so that the model sees a full sentence, thus reducing the position discrepancy (vs. PLM in XLNet). We pre-train MPNet on a large-scale dataset (over 160GB of text corpora) and fine-tune on a variety of downstream tasks (GLUE, SQuAD, etc.). Experimental results show that MPNet outperforms MLM and PLM by a large margin, and achieves better results on these tasks compared with previous state-of-the-art pre-trained methods (e.g., BERT, XLNet, RoBERTa) under the same model setting. The code and the pre-trained models are available at: https://github.com/microsoft/MPNet.
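
As a rough illustration of the idea in the abstract, the sketch below builds one MPNet-style training input from a token list: the sequence is permuted, the tail of the permutation is chosen for prediction, and mask tokens carrying the positions of the predicted tokens are placed after the context so that the positions of the whole sentence are visible. The function name `build_mpnet_input`, the `[MASK]` string, and the seed handling are assumptions made for illustration; this is not the code from the official repository, and it omits the attention masks that the actual model needs.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol (assumed, not taken from the official code)


def build_mpnet_input(tokens, num_predicted, seed=0):
    """Build one permuted input in the spirit of MPNet (illustrative sketch).

    Returns (input_tokens, input_positions, targets), where positions are
    indices into the original, unpermuted sentence.
    """
    n = len(tokens)
    if not 0 < num_predicted <= n:
        raise ValueError("num_predicted must be between 1 and len(tokens)")

    # Sample a permutation of the original positions.
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)

    # The first c positions of the permutation are context; the rest are predicted.
    c = n - num_predicted
    context_pos = perm[:c]
    predicted_pos = perm[c:]

    # Context tokens, then one mask per predicted position, then the predicted
    # tokens themselves (later predictions may condition on earlier ones).
    input_tokens = (
        [tokens[p] for p in context_pos]
        + [MASK] * num_predicted
        + [tokens[p] for p in predicted_pos]
    )
    # The masks reuse the positions of the predicted tokens, so context plus
    # masks together cover every position of the sentence.
    input_positions = context_pos + predicted_pos + predicted_pos

    targets = [(p, tokens[p]) for p in predicted_pos]
    return input_tokens, input_positions, targets


if __name__ == "__main__":
    sentence = ["the", "task", "is", "sentence", "classification"]
    toks, pos, tgt = build_mpnet_input(sentence, num_predicted=2, seed=1)
    print(toks)
    print(pos)
    print(tgt)
```

The context tokens and the mask tokens together always span all original positions, which is the "auxiliary position information" the abstract refers to; the trailing copy of the predicted tokens is what allows dependencies among predicted tokens to be modeled.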

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu • 2020

Related benchmarks

Task	Dataset	Result
Natural Language Understanding	GLUE (dev)	SST-2 (Acc) 96.7	529
Natural Language Understanding	GLUE (test)	SST-2 Accuracy 96	416
Question Answering	SQuAD v1.1 (dev)	F1 Score 92.7	380
Sentiment Analysis	IMDB (test)	Accuracy 95.1	306
Question Answering	2Wiki	--	241
Question Answering	SQuAD v2.0 (dev)	F1 85.7	163
Sentiment Classification	IMDB (test)	Error Rate 4.4	144
Emotion Recognition in Conversation	MELD (test)	--	143
Machine Reading Comprehension	RACE (test)	RACE Accuracy (Medium) 79.7	111
Emotion Recognition	MELD (test)	--	89