Neural Cross-Lingual Named Entity Recognition with Minimal Resources

About

For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.
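The translation step described above can be illustrated with a minimal sketch. It assumes that source and target word vectors already live in one shared bilingual embedding space, and it picks the translation by plain cosine nearest-neighbor search. The abstract does not say which retrieval criterion the authors use, so the function names, the toy vectors, and the choice of cosine similarity are all illustrative assumptions rather than the paper's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translate_by_nearest_neighbor(src_emb, tgt_emb):
    """Map each source word to its most similar target word.

    Both dictionaries map word -> vector and are assumed (hypothetically)
    to have been aligned into the same bilingual space beforehand.
    """
    return {
        src_word: max(tgt_emb, key=lambda t: cosine(src_vec, tgt_emb[t]))
        for src_word, src_vec in src_emb.items()
    }


# Toy, hand-made vectors for illustration only (not real embeddings).
english = {
    "city": [0.9, 0.1, 0.0],
    "president": [0.1, 0.9, 0.1],
}
spanish = {
    "ciudad": [0.8, 0.2, 0.1],
    "presidente": [0.0, 1.0, 0.2],
    "casa": [0.1, 0.1, 0.9],
}

translations = translate_by_nearest_neighbor(english, spanish)
print(translations)
```

With real embeddings the exhaustive search over the target vocabulary would normally be replaced by a vectorized or approximate nearest-neighbor lookup; the loop here is kept only for readability.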

Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, Jaime Carbonell • 2018

Related benchmarks

Task	Dataset	Result
Named Entity Recognition	CoNLL Spanish NER 2002 (test)	F1 Score 78.14	98
Named Entity Recognition	CoNLL Dutch 2002 (test)	F1 Score 80.98	87
Named Entity Recognition	CoNLL German 2003 (test)	F1 Score 57.76	78
Named Entity Recognition	CoNLL NER 2002/2003 (test)	German F1 Score 57.76	59
Named Entity Recognition	Spanish (test)	F1 Score 86.26	15
Named Entity Recognition	Dutch (test)	F1 Score 86.4	15
Named Entity Recognition	CoNLL de 2003 (test)	F1 Score 73.65	12
Named Entity Recognition	English-to-Dutch en-nl	F1 Score 71.25	12
Named Entity Recognition	English-to-Spanish en-es	F1 Score 71.03	12
Named Entity Recognition	English-to-German en-de	F1 Score 56.9	12

Showing 10 of 19 rows
