
Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction

About

In recent years there has been a surge of interest in applying distant supervision (DS) to automatically generate training data for relation extraction (RE). In this paper, we study what limits the performance of DS-trained neural models, conduct thorough analyses, and identify a factor that can greatly influence performance: shifted label distribution. Specifically, we found that this problem commonly exists in real-world DS datasets, and that without special handling, typical DS-RE models cannot automatically adapt to this shift and thus suffer deteriorated performance. To further validate our intuition, we develop a simple yet effective adaptation method for DS-trained models, bias adjustment, which updates models learned over the source domain (i.e., the DS training set) with a label distribution estimated on the target domain (i.e., the test set). Experiments demonstrate that bias adjustment achieves consistent performance gains on DS-trained models, especially on neural models, with up to a 23% relative F1 improvement, which verifies our assumptions. Our code and data can be found at https://github.com/INK-USC/shifted-label-distribution.
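The abstract describes bias adjustment as correcting a model trained under the source (DS training) label distribution using a label distribution estimated on the target (test) set. The paper's exact procedure is in the linked repository; as an illustration only, a common form of such a prior correction under label shift adds log(p_target / p_source) to each class logit before taking the argmax. The function name and the toy priors below are hypothetical, not taken from the paper:

```python
import numpy as np

def bias_adjust(logits, source_prior, target_prior, eps=1e-12):
    """Shift classifier logits trained under the source label
    distribution toward an estimated target distribution by adding
    log(p_target / p_source) per class (standard label-shift
    prior correction; illustrative, not the paper's exact code)."""
    source_prior = np.asarray(source_prior, dtype=float)
    target_prior = np.asarray(target_prior, dtype=float)
    return logits + np.log(target_prior + eps) - np.log(source_prior + eps)

# Toy example: two relation labels; the DS training data is skewed
# 90/10, but the target set is estimated to be balanced 50/50.
logits = np.array([2.0, 1.5])            # raw model scores
adjusted = bias_adjust(logits, [0.9, 0.1], [0.5, 0.5])
probs = np.exp(adjusted) / np.exp(adjusted).sum()
```

With these toy numbers, the over-represented source class is penalized and the under-represented one boosted, so the predicted label can flip even though the model itself is unchanged.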

Qinyuan Ye, Liyuan Liu, Maosen Zhang, Xiang Ren • 2019

Related benchmarks

Task                                 Dataset          Metric     Result   Rank
Document-level Relation Extraction   DocRED (dev)     F1 Score   51.66    231
Document-level Relation Extraction   DocRED (test)    F1 Score   50.61    179
Relation Extraction                  NYT (test)       F1 Score   51.45    85
Relation Extraction                  Wiki-KBP (test)  F1 Score   42.41    59
