Word Representations via Gaussian Embedding
About
Current work in lexical distributed representations maps each word to a point vector in low-dimensional space. Mapping instead to a density provides many interesting advantages, including better capturing uncertainty about a representation and its relationships, expressing asymmetries more naturally than dot product or cosine similarity, and enabling more expressive parameterization of decision boundaries. This paper advocates for density-based distributed embeddings and presents a method for learning representations in the space of Gaussian distributions. We compare performance on various word embedding benchmarks, investigate the ability of these embeddings to model entailment and other asymmetric relationships, and explore novel properties of the representation.
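The asymmetry mentioned above can be made concrete: with Gaussian embeddings, the KL divergence KL(P‖Q) between two word densities is naturally asymmetric, and a small KL(P‖Q) indicates that P's mass lies largely inside Q's, which lends itself to modeling entailment. The sketch below illustrates this for diagonal-covariance Gaussians; the dimensionality, words, means, and variances are hypothetical values chosen for illustration, not learned embeddings from the paper.

```python
import numpy as np

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for diagonal Gaussians P = N(mu_p, diag(var_p)), Q = N(mu_q, diag(var_q)).

    Closed form: 0.5 * sum( var_p/var_q + (mu_q - mu_p)^2/var_q - 1 + log(var_q/var_p) ).
    """
    return 0.5 * np.sum(
        var_p / var_q
        + (mu_q - mu_p) ** 2 / var_q
        - 1.0
        + np.log(var_q / var_p)
    )

# Hypothetical toy embeddings: a general concept gets a broad density,
# a specific one a narrow density shifted slightly off-center.
mu_animal, var_animal = np.zeros(4), np.full(4, 2.0)
mu_dog,    var_dog    = np.full(4, 0.3), np.full(4, 0.5)

print(kl_diag_gaussians(mu_dog, var_dog, mu_animal, var_animal))  # ~1.36: "dog" fits inside "animal"
print(kl_diag_gaussians(mu_animal, var_animal, mu_dog, var_dog))  # ~3.59: "animal" does not fit inside "dog"
```

The direction of the asymmetry is the point: ranking by negative KL gives a different answer for "dog entails animal" than for "animal entails dog", a distinction a symmetric dot product or cosine between point vectors cannot express.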
Benchmark results
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Word Similarity | RG-65 | Spearman correlation | 0.69 | 41 |
| Word Similarity | SimLex-999 | Spearman correlation | 0.25 | 31 |
| Word Similarity | Mechanical Turk-771 | Spearman correlation | 0.57 | 8 |
| Word Similarity | RW-STANFORD | Spearman correlation | 0.40 | 6 |
| Word Similarity | WS-353 REL | Spearman correlation | 0.61 | 6 |
| Word Similarity | MC-30 | Spearman correlation | 0.59 | 6 |
| Word Similarity | WS-353 ALL | Spearman correlation | 0.53 | 6 |
| Word Similarity | WS-YP-130 | Spearman correlation | 0.37 | 6 |
| Word Similarity | MEN 3k (train) | Spearman correlation | 0.65 | 6 |
| Word Similarity | MTurk-287 | Spearman correlation | 0.61 | 6 |
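Each score above is a Spearman rank correlation between human similarity ratings and model similarity scores over a dataset's word pairs. A minimal sketch of that evaluation loop, assuming a `similarity(w1, w2)` callable (a hypothetical interface) backed by the learned embeddings:

```python
from scipy.stats import spearmanr

def evaluate_word_similarity(pairs, human_scores, similarity):
    """Spearman rank correlation between model scores and gold human ratings.

    pairs: list of (word1, word2) tuples from a benchmark such as RG-65;
    human_scores: parallel list of human similarity ratings;
    similarity: any callable scoring a word pair (hypothetical interface).
    """
    model_scores = [similarity(w1, w2) for w1, w2 in pairs]
    rho, _ = spearmanr(model_scores, human_scores)
    return rho
```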