Chinese NER Using Lattice LSTM
About
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
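The core idea described above, enumerating every lexicon word that ends at a character position and gating the resulting word cells into the character-level recurrence, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names `match_words` and `fuse_cells` are mine, and the real model uses learned sigmoid gates inside an LSTM rather than the fixed softmax fusion shown here.

```python
import numpy as np

def match_words(chars, lexicon):
    """Build the word lattice: every lexicon word (length >= 2)
    that matches a contiguous character span, keyed by its span."""
    spans = []
    for end in range(len(chars)):
        for start in range(end):  # start < end, so words have >= 2 chars
            word = "".join(chars[start:end + 1])
            if word in lexicon:
                spans.append((start, end, word))
    return spans

def fuse_cells(char_cell, word_cells, raw_gates):
    """Combine the character cell with word cells ending at this position.
    Gates are softmax-normalized here for simplicity, so the result is a
    convex combination of the candidate cell states."""
    gates = np.exp(raw_gates - np.max(raw_gates))
    gates = gates / gates.sum()
    stacked = np.vstack([char_cell] + word_cells)
    return gates @ stacked

# The classic ambiguity example: "Nanjing Yangtze River Bridge".
chars = list("南京市长江大桥")
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
spans = match_words(chars, lexicon)
# The lattice contains both the correct word 长江大桥 and the
# spurious 市长 ("mayor"); the gates decide which paths matter.

# Toy fusion at one position: one character cell, two word cells,
# uniform gate logits -> equal-weight average of the three cells.
fused = fuse_cells(np.ones(4), [np.zeros(4), 2 * np.ones(4)],
                   np.zeros(3))
```

Because the lattice keeps all matched words rather than committing to one segmentation, a wrong path such as 市长 can simply receive a low gate weight instead of corrupting the input, which is how the model avoids segmentation errors.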
Yue Zhang, Jie Yang · 2018
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Named Entity Recognition | OntoNotes | F1-score | 73.88 | 91 |
| Named Entity Recognition | MSRA (test) | F1 Score | 93.18 | 63 |
| Named Entity Recognition | OntoNotes 4.0 (test) | F1 Score | 73.88 | 55 |
| Named Entity Recognition | RESUME | F1 Score | 94.5 | 52 |
| Named Entity Recognition | Weibo (test) | Overall Score | 58.79 | 50 |
| Named Entity Recognition | OntoNotes (test) | F1 Score | 73.88 | 34 |
| Chinese Word Segmentation | PKU (test) | F1 | 95.8 | 32 |
| Named Entity Recognition | MSRA | F1 Score | 93.18 | 29 |
| Named Entity Recognition | Resume (test) | F1 Score | 94.46 | 28 |
| Named Entity Recognition | | F1 Score | 58.8 | 27 |
Showing 10 of 28 rows