Bottom-Up Constituency Parsing and Nested Named Entity Recognition with Pointer Networks
About
Constituency parsing and nested named entity recognition (NER) are similar tasks since they both aim to predict a collection of nested and non-crossing spans. In this work, we cast nested NER to constituency parsing and propose a novel pointing mechanism for bottom-up parsing to tackle both tasks. The key idea is based on the observation that if we traverse a constituency tree in post-order, i.e., visiting a parent after its children, then two consecutively visited spans would share a boundary. Our model tracks the shared boundaries and predicts the next boundary at each step by leveraging a pointer network. As a result, it needs only linear steps to parse and thus is efficient. It also maintains a parsing configuration for structural consistency, i.e., always outputting valid trees. Experimentally, our model achieves the state-of-the-art performance on PTB among all BERT-based models (96.01 F1 score) and competitive performance on CTB7 in constituency parsing; and it also achieves strong performance on three benchmark datasets of nested NER: ACE2004, ACE2005, and GENIA. Our code is publicly available at \url{https://github.com/sustcsonglin/pointer-net-for-nested}.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Nested Named Entity Recognition | ACE 2004 (test) | F1 Score86.94 | 166 | |
| Nested Named Entity Recognition | ACE 2005 (test) | F1 Score85.53 | 153 | |
| Nested Named Entity Recognition | GENIA (test) | F1 Score78.22 | 140 | |
| Constituent Parsing | PTB (test) | F196.48 | 127 | |
| Named Entity Recognition | ACE04 (test) | F1 Score86.94 | 36 | |
| Constituency Parsing | CTB 5.1 (test) | F1 Score92.41 | 25 | |
| Named Entity Recognition | ACE05 splits of Lu and Roth (test) | F1 Score85.53 | 14 | |
| Constituency Parsing | PTB (test) | Speed (Sents/s)855 | 12 | |
| Constituency Parsing | CTB7 (test) | -- | 3 |