Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders

About

A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed, causing existing models to generally perform poorly on senses that are either rare or unseen during training. We propose a bi-encoder model that independently embeds (1) the target word with its surrounding context and (2) the dictionary definition, or gloss, of each sense. The encoders are jointly optimized in the same representation space, so that sense disambiguation can be performed by finding the nearest sense embedding for each target word embedding. Our system outperforms previous state-of-the-art models on English all-words WSD; these gains predominantly come from improved performance on rare senses, leading to a 31.1% error reduction on less frequent senses over prior work. This demonstrates that rare senses can be more effectively disambiguated by modeling their definitions.
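The nearest-sense lookup described above can be illustrated with a minimal sketch. It assumes the two encoders have already produced fixed-size vectors, that similarity is a dot product, and that training uses a cross-entropy loss over the candidate senses. The toy vectors, the sense keys, and the helper names `dot`, `disambiguate`, and `cross_entropy` are hypothetical and are not taken from the authors' code.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def disambiguate(context_vec, gloss_vecs):
    """Score each candidate sense against the target word embedding
    and return the highest-scoring sense together with all scores."""
    scores = {sense: dot(context_vec, g) for sense, g in gloss_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores

def cross_entropy(scores, gold_sense):
    """Negative log-likelihood of the gold sense under a softmax over
    the candidate scores (assumed training objective for this sketch)."""
    m = max(scores.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return log_z - scores[gold_sense]

# Hypothetical toy embeddings. In a real system these would come from a
# context encoder (target word in its sentence) and a gloss encoder
# (dictionary definition of each sense).
context_vec = [0.9, 0.1, 0.3]  # "bank" in "we sat on the river bank"
gloss_vecs = {
    "bank%river": [1.0, 0.0, 0.2],    # "sloping land beside a body of water"
    "bank%finance": [0.0, 1.0, 0.1],  # "an institution that accepts deposits"
}

best, scores = disambiguate(context_vec, gloss_vecs)
loss = cross_entropy(scores, "bank%river")
print(best, scores, loss)
```

Because a sense is represented only through its gloss embedding, a sense that is rare or absent in the training data can still be scored as long as its definition is available, which is the intuition behind the reported gains on less frequent senses.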

Terra Blevins, Luke Zettlemoyer • 2020

Related benchmarks

Task	Dataset	Result
Word Sense Disambiguation	SensEval-3 (test)	F1 Score: 77.4	51
Word Sense Disambiguation	SemEval Task 7 (S7-T7) 2007 (test)	F1 Score: 74.5	29
Word Sense Disambiguation	hardEN	F1 Score: 0.00	19
Word Sense Disambiguation	42D	F1 Score: 53.2	19
Word Sense Disambiguation	English All-Words Average (test)	--	19
Word Sense Disambiguation	FEWS (test)	--	19
Word Sense Disambiguation	SemEval-13 (SE13) 3.0 (test)	F1 Score: 79.7	16
Word Sense Disambiguation	Senseval-2 (SE2) 3.0 (test)	F1 Score: 79.4	16
Word Sense Disambiguation	SemEval-15 (SE15) 3.0 (test)	F1 Score: 81.7	16
Word Sense Disambiguation	All-Words WSD Concatenation SE2+SE3+SE13+SE15 3.0 (test)	Overall F1: 79	16

Showing 10 of 25 rows
