GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge
About
Word Sense Disambiguation (WSD) aims to find the exact sense of an ambiguous word in a particular context. Traditional supervised methods rarely take into account lexical resources such as WordNet, which are widely used in knowledge-based methods. Recent studies have shown the effectiveness of incorporating glosses (sense definitions) into neural networks for WSD. However, these approaches have achieved little improvement over traditional word-expert supervised methods. In this paper, we focus on how to better leverage gloss knowledge in a supervised neural WSD system. We construct context-gloss pairs and propose three BERT-based models for WSD. We fine-tune the pre-trained BERT model on the SemCor3.0 training corpus, and experimental results on several English all-words WSD benchmark datasets show that our approach outperforms state-of-the-art systems.
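As a rough illustration of the context-gloss pairs mentioned in the abstract, the sketch below pairs one context sentence with each candidate gloss of the target word. The toy sense inventory, the sense keys, the function name `build_context_gloss_pairs`, and the binary label (1 for the gold sense, 0 otherwise) are all assumptions made for illustration; the abstract does not specify the exact pair format or labeling scheme, and a real system would read glosses from WordNet.

```python
from typing import Dict, List, Tuple

# Hypothetical toy sense inventory (not from the paper); a real system
# would look up candidate senses and glosses in WordNet.
TOY_GLOSSES: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank%1": "a financial institution that accepts deposits",
        "bank%2": "sloping land beside a body of water",
    }
}


def build_context_gloss_pairs(
    context: str,
    target: str,
    gold_sense: str,
    glosses: Dict[str, Dict[str, str]] = TOY_GLOSSES,
) -> List[Tuple[str, str, int]]:
    """Return one (context, gloss, label) triple per candidate sense.

    The label is an assumed binary target: 1 if the gloss belongs to the
    gold sense, 0 otherwise.
    """
    pairs = []
    for sense_key, gloss in glosses[target].items():
        label = 1 if sense_key == gold_sense else 0
        pairs.append((context, gloss, label))
    return pairs


if __name__ == "__main__":
    example = build_context_gloss_pairs(
        "He sat on the bank of the river.", "bank", "bank%2"
    )
    for context, gloss, label in example:
        print(label, "|", context, "|", gloss)
```

Each triple could then be fed to a sentence-pair classifier, with the context as the first segment and the gloss as the second; how the pairs are actually encoded and scored is described in the paper itself, not here.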
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Word Sense Disambiguation | SensEval-3 (test) | F1 Score: 75.2 | 51 |
| Word Sense Disambiguation | English All-Words Average (test) | -- | 19 |
| Word Sense Disambiguation | Senseval-2 (SE2) 3.0 (test) | F1 Score: 77.7 | 16 |
| Word Sense Disambiguation | All-Words WSD Concatenation SE2+SE3+SE13+SE15 3.0 (test) | Overall F1: 77 | 16 |
| Word Sense Disambiguation | SemEval-15 (SE15) 3.0 (test) | F1 Score: 80.4 | 16 |
| Word Sense Disambiguation | SemEval-13 (SE13) 3.0 (test) | F1 Score: 76.1 | 16 |
| Word Sense Disambiguation | SemEval-07 3.0 (dev) | F1 Score: 72.5 | 14 |
| Word Sense Disambiguation | 42D | F1 Score: 45.7 | 9 |
| Word Sense Disambiguation | S10 | F1 Score: 75.8 | 9 |
| Word Sense Disambiguation | softEN | F1 Score: 77.1 | 9 |