Word Representations via Gaussian Embedding
About
Current work in lexical distributed representations maps each word to a point vector in low-dimensional space. Mapping instead to a density provides many interesting advantages, including better capturing uncertainty about a representation and its relationships, expressing asymmetries more naturally than dot product or cosine similarity, and enabling more expressive parameterization of decision boundaries. This paper advocates for density-based distributed embeddings and presents a method for learning representations in the space of Gaussian distributions. We compare performance on various word embedding benchmarks, investigate the ability of these embeddings to model entailment and other asymmetric relationships, and explore novel properties of the representation.
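The asymmetry mentioned above can be made concrete: with Gaussian embeddings, the KL divergence KL(P‖Q) between two word densities is naturally asymmetric, and a small KL(P‖Q) indicates that P's mass lies largely inside Q's, which lends itself to modeling entailment. The sketch below illustrates this for diagonal-covariance Gaussians; the dimensionality, words, means, and variances are hypothetical values chosen for illustration, not learned embeddings from the paper.

```python
import numpy as np

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for diagonal Gaussians P = N(mu_p, diag(var_p)), Q = N(mu_q, diag(var_q)).

    Closed form: 0.5 * sum( var_p/var_q + (mu_q - mu_p)^2/var_q - 1 + log(var_q/var_p) ).
    """
    return 0.5 * np.sum(
        var_p / var_q
        + (mu_q - mu_p) ** 2 / var_q
        - 1.0
        + np.log(var_q / var_p)
    )

# Hypothetical toy embeddings: a general concept gets a broad density,
# a specific one a narrow density shifted slightly off-center.
mu_animal, var_animal = np.zeros(4), np.full(4, 2.0)
mu_dog,    var_dog    = np.full(4, 0.3), np.full(4, 0.5)

print(kl_diag_gaussians(mu_dog, var_dog, mu_animal, var_animal))  # ~1.36: "dog" fits inside "animal"
print(kl_diag_gaussians(mu_animal, var_animal, mu_dog, var_dog))  # ~3.59: "animal" does not fit inside "dog"
```

The direction of the asymmetry is the point: ranking by negative KL gives a different answer for "dog entails animal" than for "animal entails dog", a distinction a symmetric dot product or cosine between point vectors cannot express.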
Benchmark results
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Word Similarity | RG-65 | Spearman correlation | 0.69 | 41 |
| Word Similarity | SimLex-999 | Spearman correlation | 0.25 | 31 |
| Word Similarity | Mechanical Turk-771 | Spearman correlation | 0.57 | 8 |
| Word Similarity | RW-STANFORD | Spearman correlation | 0.40 | 6 |
| Word Similarity | WS-353 REL | Spearman correlation | 0.61 | 6 |
| Word Similarity | MC-30 | Spearman correlation | 0.59 | 6 |
| Word Similarity | WS-353 ALL | Spearman correlation | 0.53 | 6 |
| Word Similarity | WS-YP-130 | Spearman correlation | 0.37 | 6 |
| Word Similarity | MEN 3k (train) | Spearman correlation | 0.65 | 6 |
| Word Similarity | MTurk-287 | Spearman correlation | 0.61 | 6 |
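Each score above is a Spearman rank correlation between human similarity ratings and model similarity scores over a dataset's word pairs. A minimal sketch of that evaluation loop, assuming a `similarity(w1, w2)` callable (a hypothetical interface) backed by the learned embeddings:

```python
from scipy.stats import spearmanr

def evaluate_word_similarity(pairs, human_scores, similarity):
    """Spearman rank correlation between model scores and gold human ratings.

    pairs: list of (word1, word2) tuples from a benchmark such as RG-65;
    human_scores: parallel list of human similarity ratings;
    similarity: any callable scoring a word pair (hypothetical interface).
    """
    model_scores = [similarity(w1, w2) for w1, w2 in pairs]
    rho, _ = spearmanr(model_scores, human_scores)
    return rho
```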