
Semi-supervised sequence tagging with bidirectional language models

About

Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context-sensitive representations is trained on relatively little labeled data. In this paper, we demonstrate a general semi-supervised approach for adding pre-trained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. We evaluate our model on two standard datasets for named entity recognition (NER) and chunking, and in both cases achieve state-of-the-art results, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task-specific gazetteers.
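The core mechanism described in the abstract can be sketched in a few lines: run a frozen, pre-trained bidirectional language model over the sentence, concatenate its forward and backward context vectors per token, and append that "LM embedding" to the task model's own token representation before tagging. The shapes and variable names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, word_dim, lm_dim = 5, 8, 4

# Token representations produced by the supervised task model.
word_embeddings = rng.normal(size=(seq_len, word_dim))

# Hidden states from a pre-trained bidirectional LM (kept frozen;
# these would come from real forward/backward LMs in practice).
fwd_lm_states = rng.normal(size=(seq_len, lm_dim))
bwd_lm_states = rng.normal(size=(seq_len, lm_dim))

# LM embedding per token: concatenation of both directions.
lm_embeddings = np.concatenate([fwd_lm_states, bwd_lm_states], axis=-1)

# Augmented tagger input: [word representation; LM embedding] per token.
tagger_input = np.concatenate([word_embeddings, lm_embeddings], axis=-1)

print(tagger_input.shape)  # (5, 16)
```

The concatenated vectors would then feed the sequence-labeling RNN and CRF layer; only the supervised components are trained, while the LM parameters stay fixed.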

Matthew E. Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power · 2017

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Named Entity Recognition | CoNLL 2003 (test) | F1 Score | 92.2 | 539 |
| Language Modeling | One Billion Word Benchmark (test) | Test Perplexity | 47.5 | 108 |
| Chunking | CoNLL 2000 (test) | F1 Score | 96.37 | 88 |
| Named Entity Recognition | CoNLL 2003 | F1 Score | 91.93 | 86 |
| Named Entity Recognition | NER (test) | F1 Score | 91.93 | 68 |
| Chunking | Chunk (test) | F1 Score | 96.37 | 28 |
| Entity Recognition | SCIERC (test) | F1 Score | 62 | 20 |
| Keyphrase Extraction | SemEval Task 10 ScienceIE 2017 (test) | F1 Score | 44 | 15 |
| Entity Recognition | SCIERC (dev) | Precision | 67.2 | 6 |
| Span Identification | SemEval ScienceIE Task 10 2017 (test) | F1 Score | 55 | 3 |

Showing 10 of 12 rows.
