CONTaiNER: Few-Shot Named Entity Recognition via Contrastive Learning
About
Named Entity Recognition (NER) in the Few-Shot setting is imperative for entity tagging in low-resource domains. Existing approaches only learn class-specific semantic features and intermediate representations from source domains. This affects generalizability to unseen target domains, resulting in suboptimal performance. To this end, we present CONTaiNER, a novel contrastive learning technique that optimizes the inter-token distribution distance for Few-Shot NER. Instead of optimizing class-specific attributes, CONTaiNER optimizes a generalized objective of differentiating between token categories based on their Gaussian-distributed embeddings. This effectively alleviates overfitting issues originating from training domains. Our experiments in several traditional test domains (OntoNotes, CoNLL'03, WNUT '17, GUM) and a new large-scale Few-Shot NER dataset (Few-NERD) demonstrate that, on average, CONTaiNER outperforms previous methods by 3%-13% absolute F1 points while showing consistent performance trends, even in challenging scenarios where previous approaches could not achieve appreciable performance.
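To make the idea of contrasting tokens by the distance between their Gaussian embeddings concrete, here is a minimal sketch in plain Python. It rests on several assumptions not stated in the abstract: each token is represented by a diagonal Gaussian (a mean vector and a variance vector), the distance is a symmetrized KL divergence, and the loss is a softmax-style contrastive objective over the other tokens in a batch. The function names (`kl_diag_gaussians`, `symmetric_distance`, `contrastive_loss`) are hypothetical and are not taken from the authors' code.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(N_p || N_q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_distance(p, q):
    """Average of the two KL directions (assumed distance, for illustration)."""
    (mu_p, var_p), (mu_q, var_q) = p, q
    forward = kl_diag_gaussians(mu_p, var_p, mu_q, var_q)
    backward = kl_diag_gaussians(mu_q, var_q, mu_p, var_p)
    return 0.5 * (forward + backward)


def contrastive_loss(embeddings, labels):
    """Mean over anchors of -log(similarity to same-label tokens / similarity to all others).

    `embeddings` is a list of (mean, variance) pairs, one per token.
    Anchors with no usable same-label partner are skipped.
    """
    losses = []
    n = len(embeddings)
    for i in range(n):
        positive_mass = 0.0
        total_mass = 0.0
        for j in range(n):
            if j == i:
                continue
            weight = math.exp(-symmetric_distance(embeddings[i], embeddings[j]))
            total_mass += weight
            if labels[j] == labels[i]:
                positive_mass += weight
        if positive_mass > 0.0:
            losses.append(-math.log(positive_mass / total_mass))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two tight clusters of one-dimensional Gaussian embeddings.
tokens = [([0.0], [1.0]), ([0.0], [1.0]), ([5.0], [1.0]), ([5.0], [1.0])]
aligned = contrastive_loss(tokens, [0, 0, 1, 1])   # labels match the clusters
shuffled = contrastive_loss(tokens, [0, 1, 0, 1])  # labels cut across the clusters
```

In this toy batch the loss is small when tokens with the same label sit in the same cluster and large when the labels cut across clusters. An objective of this kind rewards separating token categories by distribution distance; it does not fit a per-class classifier.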
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Named Entity Recognition | CoNLL 2003 (test) | -- | 539 |
| Named Entity Recognition | CoNLL 03 | -- | 102 |
| Named Entity Recognition | Wnut 2017 | -- | 79 |
| Named Entity Recognition | WNUT 2017 (test) | -- | 63 |
| Named Entity Recognition | Few-NERD INTER 1.0 (test) | Average F1: 61.83 | 62 |
| Named Entity Recognition | FewNERD INTRA | F1 Score: 57.83 | 47 |
| Few-shot Named Entity Recognition | FewNERD Intra 1.0 | F1 Score: 53.7 | 44 |
| Few-shot Named Entity Recognition | Few-NERD Intra (test) | F1 Score: 53.7 | 40 |
| Named Entity Recognition | NCBI-disease (test) | -- | 40 |
| Named Entity Recognition | GUM | Micro F1: 24.4 | 36 |