Chinese NER Using Lattice LSTM
About
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
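The core idea described above, enumerating every lexicon word that ends at a character position and gating the resulting word cells into the character-level recurrence, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names `match_words` and `fuse_cells` are mine, and the real model uses learned sigmoid gates inside an LSTM rather than the fixed softmax fusion shown here.

```python
import numpy as np

def match_words(chars, lexicon):
    """Build the word lattice: every lexicon word (length >= 2)
    that matches a contiguous character span, keyed by its span."""
    spans = []
    for end in range(len(chars)):
        for start in range(end):  # start < end, so words have >= 2 chars
            word = "".join(chars[start:end + 1])
            if word in lexicon:
                spans.append((start, end, word))
    return spans

def fuse_cells(char_cell, word_cells, raw_gates):
    """Combine the character cell with word cells ending at this position.
    Gates are softmax-normalized here for simplicity, so the result is a
    convex combination of the candidate cell states."""
    gates = np.exp(raw_gates - np.max(raw_gates))
    gates = gates / gates.sum()
    stacked = np.vstack([char_cell] + word_cells)
    return gates @ stacked

# The classic ambiguity example: "Nanjing Yangtze River Bridge".
chars = list("南京市长江大桥")
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
spans = match_words(chars, lexicon)
# The lattice contains both the correct word 长江大桥 and the
# spurious 市长 ("mayor"); the gates decide which paths matter.

# Toy fusion at one position: one character cell, two word cells,
# uniform gate logits -> equal-weight average of the three cells.
fused = fuse_cells(np.ones(4), [np.zeros(4), 2 * np.ones(4)],
                   np.zeros(3))
```

Because the lattice keeps all matched words rather than committing to one segmentation, a wrong path such as 市长 can simply receive a low gate weight instead of corrupting the input, which is how the model avoids segmentation errors.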
Yue Zhang, Jie Yang · 2018
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Named Entity Recognition | OntoNotes | F1-score | 73.88 | 91 |
| Named Entity Recognition | MSRA (test) | F1 Score | 93.18 | 63 |
| Named Entity Recognition | OntoNotes 4.0 (test) | F1 Score | 73.88 | 55 |
| Named Entity Recognition | RESUME | F1 Score | 94.5 | 52 |
| Named Entity Recognition | Weibo (test) | Overall Score | 58.79 | 50 |
| Named Entity Recognition | OntoNotes (test) | F1 Score | 73.88 | 34 |
| Chinese Word Segmentation | PKU (test) | F1 | 95.8 | 32 |
| Named Entity Recognition | MSRA | F1 Score | 93.18 | 29 |
| Named Entity Recognition | Resume (test) | F1 Score | 94.46 | 28 |
| Named Entity Recognition | | F1 Score | 58.8 | 27 |
Showing 10 of 28 rows