
Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning

About

We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space. Prior work predominantly approaches NER as sequence labeling or span classification. We instead frame NER as a representation learning problem that maximizes the similarity between the vector representations of an entity mention and its type. This makes it easy to handle nested and flat NER alike, and can better leverage noisy self-supervision signals. A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions. Instead of explicitly labeling all non-entity spans as the same class $\texttt{Outside}$ ($\texttt{O}$) as in most prior methods, we introduce a novel dynamic thresholding loss. Experiments show that our method performs well in both supervised and distantly supervised settings, for nested and flat NER alike, establishing new state of the art across standard datasets in the general domain (e.g., ACE2004, ACE2005) and high-value verticals such as biomedicine (e.g., GENIA, NCBI, BC5CDR, JNLPBA). We release the code at github.com/microsoft/binder.
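The inference idea in the abstract can be illustrated with a minimal sketch. This is not the released implementation. It assumes that span and entity-type vectors have already been produced by the two encoders, that similarity is cosine similarity, and that the dynamic threshold for each type is the similarity between that type and a sentence-level reference vector. The names `cosine`, `predict_spans`, and `anchor_vec`, and the toy vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_spans(span_vecs, type_vecs, anchor_vec):
    """Keep (span, type, score) triples whose similarity beats the type's threshold.

    Assumption: the threshold for a type is its similarity to `anchor_vec`,
    so it changes with the input instead of being a fixed constant and no
    explicit Outside class is needed.
    """
    predictions = []
    for type_name, type_vec in type_vecs.items():
        threshold = cosine(anchor_vec, type_vec)
        for span, span_vec in span_vecs.items():
            score = cosine(span_vec, type_vec)
            if score > threshold:
                predictions.append((span, type_name, score))
    return predictions


# Toy 3-d vectors (hypothetical); the third axis stands in for "not an entity".
type_vecs = {"PER": [1.0, 0.0, 0.0], "ORG": [0.0, 1.0, 0.0]}
span_vecs = {
    (0, 1): [0.9, 0.1, 0.2],  # inner span, close to PER
    (0, 3): [0.1, 0.9, 0.2],  # outer span containing (0, 1), close to ORG
    (2, 3): [0.2, 0.2, 0.9],  # non-entity span, far from both types
}
anchor_vec = [0.5, 0.5, 0.7]

for span, type_name, score in predict_spans(span_vecs, type_vecs, anchor_vec):
    print(span, type_name, round(score, 3))
```

Each candidate span is scored independently, so the overlapping spans `(0, 1)` and `(0, 3)` can both be accepted, which is why the same procedure covers nested and flat NER. The span `(2, 3)` falls below the threshold for every type and is dropped.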

Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Named Entity Recognition | CoNLL 2003 (test) | F1 Score | 93.33 | 539 |
| Nested Named Entity Recognition | ACE 2004 (test) | F1 Score | 89.7 | 166 |
| Nested Named Entity Recognition | ACE 2005 (test) | F1 Score | 90 | 153 |
| Nested Named Entity Recognition | GENIA (test) | F1 Score | 80.8 | 140 |
| Named Entity Recognition | BC5CDR (test) | Macro F1 (span-level) | 91.9 | 80 |
| Named Entity Recognition | BC5-chem BLURB (test) | F1 Score | 95 | 5 |
| Named Entity Recognition | BC5-disease BLURB (test) | F1 Score | 88 | 5 |
| Named Entity Recognition | NCBI BLURB (test) | F1 Score | 90.9 | 5 |
| Named Entity Recognition | JNLPBA BLURB (test) | F1 Score | 80.3 | 5 |
| Named Entity Recognition | BC2GM BLURB (test) | F1 Score | 84.6 | 5 |
