Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints
About
We present Attract-Repel, an algorithm that improves the semantic quality of word vectors by injecting constraints extracted from lexical resources. Attract-Repel facilitates the use of constraints from both monolingual and cross-lingual resources, yielding semantically specialised cross-lingual vector spaces. Our evaluation shows that the method can exploit existing cross-lingual lexicons to construct high-quality vector spaces for many languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that Attract-Repel-specialised vectors boost performance on the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.
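The core idea can be illustrated with a simplified sketch: synonym ("attract") pairs are pulled above a similarity margin, while antonym ("repel") pairs are pushed apart. This is a hypothetical, stripped-down illustration of the principle, not the published method: the full algorithm uses mini-batches with negative sampling and a regularisation term that preserves the original distributional space, all of which are omitted here.

```python
import numpy as np

def attract_repel_step(vectors, synonyms, antonyms, lr=0.1,
                       attract_margin=0.6, repel_margin=0.0):
    """One simplified update step over constraint pairs.

    Synonym pairs whose cosine similarity is below `attract_margin`
    are pulled together; antonym pairs above `repel_margin` are
    pushed apart. Vectors are renormalised to unit length so that
    dot product equals cosine similarity. (Illustrative only: the
    published method additionally uses negative sampling and a
    regularisation term.)
    """
    grads = {w: np.zeros_like(v) for w, v in vectors.items()}

    for a, b in synonyms:
        if vectors[a] @ vectors[b] < attract_margin:
            # move the synonym pair towards each other
            grads[a] += vectors[b] - vectors[a]
            grads[b] += vectors[a] - vectors[b]

    for a, b in antonyms:
        if vectors[a] @ vectors[b] > repel_margin:
            # move the antonym pair apart
            grads[a] -= vectors[b]
            grads[b] -= vectors[a]

    for w in vectors:
        v = vectors[w] + lr * grads[w]
        vectors[w] = v / np.linalg.norm(v)  # keep unit norm
    return vectors
```

Iterating this step over the constraint set drives synonym pairs towards higher, and antonym pairs towards lower, cosine similarity, which is the behaviour the specialised spaces are evaluated on with SimLex-999 and SimVerb-3500.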
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dialogue State Tracking | WOZ 2.0 (test) | Joint Goal Accuracy | 81.7 | 65 |
| Word Similarity | SimLex-999 (test) | Spearman Correlation | 0.781 | 30 |
| Word Similarity | SimVerb-3500 (test) | Spearman Correlation | 0.761 | 27 |
| Lexical Text Simplification | LS dataset standard (test) | Accuracy | 69.8 | 12 |
| Word Similarity | English SimLex-999, disjoint setting (test) | Spearman's Rho | 0.414 | 12 |
| Word Similarity | English SimVerb-3500, disjoint setting (test) | Spearman's Rho | 0.28 | 12 |