T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition
About
Language model (LM) pretraining has led to consistent improvements in many NLP downstream tasks, including named entity recognition (NER). In this paper, we present T-NER (Transformer-based Named Entity Recognition), a Python library for NER LM finetuning. In addition to its practical utility, T-NER facilitates the study of the cross-domain and cross-lingual generalization ability of LMs finetuned on NER. Our library also provides a web app where users can get model predictions interactively for arbitrary text, which facilitates qualitative model evaluation for non-expert programmers. We show the potential of the library by compiling nine public NER datasets into a unified format and evaluating cross-domain and cross-lingual performance across the datasets. The results from our initial experiments show that in-domain performance is generally competitive across datasets. However, cross-domain generalization is challenging even with a large pretrained LM, which nevertheless has the capacity to learn domain-specific features if finetuned on a combined dataset. To facilitate future research, we also release all our LM checkpoints via the Hugging Face model hub.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Named Entity Recognition | CoNLL 2003 (test) | -- | 539 |
| Named Entity Recognition | OntoNotes 5.0 (test) | -- | 90 |
| Named Entity Recognition | WNUT 2017 (test) | -- | 63 |
| Named Entity Recognition | Finnish (test) | -- | 4 |
| Named Entity Recognition | WikiAnn ja (test) | Type-aware F1: 86.5 | 3 |
| Named Entity Recognition | BioNLP 2004 (test) | Type-aware F1: 74.3 | 3 |
| Named Entity Recognition | BioCreative V (test) | Type-aware F1: 0.886 | 3 |
| Named Entity Recognition | WikiAnn en (test) | Type-aware F1: 84 | 3 |
| Named Entity Recognition | WikiAnn ru (test) | Type-aware F1: 90 | 3 |
| Named Entity Recognition | MIT Restaurant (test) | Type-aware F1: 0.796 | 2 |
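
The table reports a "type-aware" F1 without defining it. As a rough illustration only, the sketch below assumes the usual span-level reading: a predicted entity counts as correct when both its boundaries and its type match the gold annotation, while a type-ignored variant checks boundaries alone. The helper names `bio_to_spans` and `span_f1` are hypothetical and are not part of T-NER; the scores are micro-averaged over sentences and BIO tagging is assumed, so the numbers in the table may have been computed differently.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); hypothetical representation for this sketch
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence.

    An I- tag that does not continue an open span of the same type is ignored.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]],
            type_aware: bool = True) -> float:
    """Micro-averaged span-level F1 over a list of tagged sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        if not type_aware:
            # Drop the entity type so only the boundaries are compared.
            g = {(s, e) for s, e, _ in g}
            p = {(s, e) for s, e, _ in p}
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the second entity has the right boundaries but the wrong type.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "B-ORG"]]
type_aware_score = span_f1(gold, pred)
type_ignored_score = span_f1(gold, pred, type_aware=False)
```

In the toy example one of the two predicted spans matches under the type-aware criterion, while both match once types are ignored, which is why the two variants can diverge sharply when a model is evaluated on a dataset with a different label set.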