Split-NER: Named Entity Recognition via Two Question-Answering-based Classifications

About

In this work, we address the NER problem by splitting it into two logical sub-tasks: (1) Span Detection, which simply extracts entity mention spans irrespective of entity type; and (2) Span Classification, which classifies the spans into their entity types. Further, we formulate both sub-tasks as question-answering (QA) problems and produce two leaner models which can be optimized separately for each sub-task. Experiments with four cross-domain datasets demonstrate that this two-step approach is both effective and time-efficient. Our system, SplitNER, outperforms baselines on OntoNotes5.0, WNUT17 and a cybersecurity dataset and gives on-par performance on BioNLP13CG. In all cases, it achieves a significant reduction in training time compared to its QA baseline counterpart. The effectiveness of our system stems from fine-tuning the BERT model twice, separately for span detection and classification. The source code can be found at https://github.com/c3sr/split-ner.
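The two-step decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two "models" are toy stand-ins (a capitalization heuristic and a small lookup table) for the two fine-tuned BERT QA models, and the question strings, function names and the TOY_GAZETTEER table are hypothetical, chosen only to show how span detection feeds span classification.

```python
from dataclasses import dataclass
from typing import List, Tuple

Span = Tuple[int, int]  # token offsets, end exclusive


@dataclass
class Entity:
    start: int
    end: int
    text: str
    label: str


# Hypothetical question templates; the paper's actual wording may differ.
DETECT_QUESTION = "Extract the entity mentions from the text."
CLASSIFY_QUESTION = "What is the type of '{mention}'?"

# Toy lookup table standing in for a trained span classification model.
TOY_GAZETTEER = {"Youngja Park": "PERSON", "IBM": "ORG"}


def toy_span_detector(question: str, tokens: List[str]) -> List[Span]:
    """Stand-in for step 1: return type-agnostic mention spans.

    Assumption: groups runs of capitalized tokens instead of running a model.
    """
    spans, start = [], None
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


def toy_span_classifier(question: str, tokens: List[str], mention: str) -> str:
    """Stand-in for step 2: assign an entity type to one detected mention."""
    return TOY_GAZETTEER.get(mention, "UNKNOWN")


def split_ner(tokens: List[str]) -> List[Entity]:
    """Run detection first, then classify each detected span separately."""
    entities = []
    for start, end in toy_span_detector(DETECT_QUESTION, tokens):
        mention = " ".join(tokens[start:end])
        question = CLASSIFY_QUESTION.format(mention=mention)
        label = toy_span_classifier(question, tokens, mention)
        entities.append(Entity(start, end, mention, label))
    return entities


if __name__ == "__main__":
    for entity in split_ner("Youngja Park works at IBM in New York".split()):
        print(entity)
```

Because the detector never sees entity types and the classifier only ever sees one mention at a time, each stand-in could be replaced or retrained without touching the other, which is the separation the abstract refers to.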

Jatin Arora, Youngja Park • 2023

Related benchmarks

Task	Dataset	Result
Named Entity Recognition	WNUT 2017	--	91
Named Entity Recognition	OntoNotes 5.0	F1 Score: 90.9	90
Named Entity Recognition	WNUT 2017 (test)	--	63
Named Entity Recognition	CTIReports	Mention-Level F1: 74.96	5
Named Entity Recognition	BioNLP13CG	Mention-Level F1: 86.75	5
Named Entity Recognition	BioNLP13CG (test)	--	4
Named Entity Recognition	CTIReports (test)	Training Latency (s): 1.46e+3	3
