
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding

About

Coarse-grained linguistic information, such as named entities or phrases, facilitates adequate representation learning in pre-training. Previous works mainly focus on extending the objective of BERT's Masked Language Modeling (MLM) from masking individual tokens to masking contiguous sequences of n tokens. We argue that such a contiguous masking method neglects to model the intra-dependencies and inter-relations of coarse-grained linguistic information. As an alternative, we propose ERNIE-Gram, an explicit n-gram masking method that enhances the integration of coarse-grained information into pre-training. In ERNIE-Gram, n-grams are masked and predicted directly using explicit n-gram identities rather than contiguous sequences of n tokens. Furthermore, ERNIE-Gram employs a generator model to sample plausible n-gram identities as optional n-gram masks and predicts them in both coarse-grained and fine-grained manners to enable comprehensive n-gram prediction and relation modeling. We pre-train ERNIE-Gram on English and Chinese text corpora and fine-tune it on 19 downstream tasks. Experimental results show that ERNIE-Gram outperforms previous pre-training models such as XLNet and RoBERTa by a large margin, and achieves results comparable to state-of-the-art methods. The source code and pre-trained models have been released at https://github.com/PaddlePaddle/ERNIE.
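
The contrast between contiguous masking and explicit n-gram masking described above can be illustrated with a minimal sketch. This is not the released implementation. The function names, the toy sentence, and the underscore-joined n-gram key are hypothetical. It also assumes that an explicitly masked n-gram is replaced by a single [MASK] symbol whose prediction target is one n-gram identity.

```python
from typing import List, Tuple

MASK = "[MASK]"


def contiguous_mask(tokens: List[str], start: int, n: int) -> Tuple[List[str], List[str]]:
    """Contiguous masking: n tokens become n [MASK] symbols,
    and each masked position is predicted as a separate token."""
    masked = tokens[:start] + [MASK] * n + tokens[start + n:]
    targets = tokens[start:start + n]
    return masked, targets


def explicit_ngram_mask(tokens: List[str], start: int, n: int) -> Tuple[List[str], str]:
    """Explicit n-gram masking (assumed form): the whole n-gram becomes
    one [MASK] symbol, and the target is a single n-gram identity."""
    masked = tokens[:start] + [MASK] + tokens[start + n:]
    # Hypothetical key into an n-gram vocabulary; the real lexicon format may differ.
    target = "_".join(tokens[start:start + n])
    return masked, target


tokens = ["the", "new", "york", "times", "reported"]
print(contiguous_mask(tokens, 1, 3))
print(explicit_ngram_mask(tokens, 1, 3))
```

In the contiguous case the model makes three token-level predictions. In the explicit case it makes one prediction over n-gram identities, which is what allows the n-gram to be treated as a single unit.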

Dongling Xiao, Yu-Kun Li, Han Zhang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang • 2020

Related benchmarks

Task                        Dataset                Result (F1)   Rank
Named Entity Recognition    OntoNotes 4.0 (test)   80.96         55
Chinese Word Segmentation   PKU (test)             96.36         32
Chinese Word Segmentation   MSRA (test)            98.27         17
Named Entity Recognition    Finance (test)         85.31         14
Chinese Word Segmentation   CTB 6.0 (test)         97.28         12
Part-of-Speech Tagging      CTB 6.0 (test)         94.93         11
Part-of-Speech Tagging      UD 2 (test)            95.16         11
Part-of-Speech Tagging      UD1 (test)             95.26         11
Named Entity Recognition    Book (test)            77.19         10
Named Entity Recognition    News (test)            79.96         10