Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints
About
We present Attract-Repel, an algorithm that improves the semantic quality of word vectors by injecting constraints extracted from lexical resources. Attract-Repel facilitates the use of constraints from both monolingual and cross-lingual resources, yielding semantically specialised cross-lingual vector spaces. Our evaluation shows that the method can exploit existing cross-lingual lexicons to construct high-quality vector spaces for many languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that Attract-Repel-specialised vectors boost performance on the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.
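The core idea can be illustrated with a simplified sketch: synonym ("attract") pairs are pulled above a similarity margin, while antonym ("repel") pairs are pushed apart. This is a hypothetical, stripped-down illustration of the principle, not the published method: the full algorithm uses mini-batches with negative sampling and a regularisation term that preserves the original distributional space, all of which are omitted here.

```python
import numpy as np

def attract_repel_step(vectors, synonyms, antonyms, lr=0.1,
                       attract_margin=0.6, repel_margin=0.0):
    """One simplified update step over constraint pairs.

    Synonym pairs whose cosine similarity is below `attract_margin`
    are pulled together; antonym pairs above `repel_margin` are
    pushed apart. Vectors are renormalised to unit length so that
    dot product equals cosine similarity. (Illustrative only: the
    published method additionally uses negative sampling and a
    regularisation term.)
    """
    grads = {w: np.zeros_like(v) for w, v in vectors.items()}

    for a, b in synonyms:
        if vectors[a] @ vectors[b] < attract_margin:
            # move the synonym pair towards each other
            grads[a] += vectors[b] - vectors[a]
            grads[b] += vectors[a] - vectors[b]

    for a, b in antonyms:
        if vectors[a] @ vectors[b] > repel_margin:
            # move the antonym pair apart
            grads[a] -= vectors[b]
            grads[b] -= vectors[a]

    for w in vectors:
        v = vectors[w] + lr * grads[w]
        vectors[w] = v / np.linalg.norm(v)  # keep unit norm
    return vectors
```

Iterating this step over the constraint set drives synonym pairs towards higher, and antonym pairs towards lower, cosine similarity, which is the behaviour the specialised spaces are evaluated on with SimLex-999 and SimVerb-3500.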
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dialogue State Tracking | WOZ 2.0 (test) | Joint Goal Accuracy | 81.7 | 65 |
| Word Similarity | SimLex-999 (test) | Spearman Correlation | 0.781 | 30 |
| Word Similarity | SimVerb-3500 (test) | Spearman Correlation | 0.761 | 27 |
| Lexical Text Simplification | LS dataset standard (test) | Accuracy | 69.8 | 12 |
| Word Similarity | English SimLex-999, disjoint setting (test) | Spearman's Rho | 0.414 | 12 |
| Word Similarity | English SimVerb-3500, disjoint setting (test) | Spearman's Rho | 0.28 | 12 |