Neural Cross-Lingual Named Entity Recognition with Minimal Resources
About
For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.
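The two ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the source and target embeddings already live in a shared bilingual space, picks a translation by cosine nearest neighbor, and uses generic scaled dot-product self-attention with no learned projections. The function names and the toy English and Dutch vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translate(word, src_emb, tgt_emb):
    """Return the target word whose embedding is closest to the source word.

    Assumes src_emb and tgt_emb are dicts mapping words to vectors that
    have already been aligned into one shared space.
    """
    query = src_emb[word]
    return max(tgt_emb, key=lambda t: cosine(query, tgt_emb[t]))


def self_attention(X):
    """Scaled dot-product self-attention over a list of token vectors.

    Every token attends to every other token, so the output for a token
    depends on content similarity rather than on fixed positions.
    Queries, keys, and values are all X itself (an illustrative
    simplification; real models learn separate projections).
    """
    d = len(X[0])
    out = []
    for q in X:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return out


# Hypothetical toy embeddings, assumed to be in a shared bilingual space.
SRC_EMB = {"city": [0.9, 0.1, 0.0], "river": [0.1, 0.9, 0.1]}
TGT_EMB = {
    "stad": [0.85, 0.15, 0.05],
    "rivier": [0.05, 0.95, 0.1],
    "huis": [0.1, 0.1, 0.9],
}

if __name__ == "__main__":
    print(translate("city", SRC_EMB, TGT_EMB))
    print(self_attention([SRC_EMB["city"], SRC_EMB["river"]]))
```

Translating the source training data word by word in this way yields target-language text that keeps the source labels, and the attention layer is one way to reduce sensitivity to the word order mismatch that such translation leaves behind.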
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Named Entity Recognition | CoNLL Spanish NER 2002 (test) | F1 Score 78.14 | 98 |
| Named Entity Recognition | CoNLL Dutch 2002 (test) | F1 Score 80.98 | 87 |
| Named Entity Recognition | CoNLL German 2003 (test) | F1 Score 57.76 | 78 |
| Named Entity Recognition | CoNLL NER 2002/2003 (test) | German F1 Score 57.76 | 59 |
| Named Entity Recognition | Spanish (test) | F1 Score 86.26 | 15 |
| Named Entity Recognition | Dutch (test) | F1 Score 86.4 | 15 |
| Named Entity Recognition | CoNLL de 2003 (test) | F1 Score 73.65 | 12 |
| Named Entity Recognition | English-to-Dutch en-nl | F1 Score 71.25 | 12 |
| Named Entity Recognition | English-to-Spanish en-es | F1 Score 71.03 | 12 |
| Named Entity Recognition | English-to-German en-de | F1 Score 56.9 | 12 |