
Filtered Inner Product Projection for Crosslingual Embedding Alignment

About

Due to widespread interest in machine translation and transfer learning, there are numerous algorithms for mapping multiple embeddings to a shared representation space. Recently, these algorithms have been studied in the setting of bilingual dictionary induction where one seeks to align the embeddings of a source and a target language such that translated word pairs lie close to one another in a common representation space. In this paper, we propose a method, Filtered Inner Product Projection (FIPP), for mapping embeddings to a common representation space and evaluate FIPP in the context of bilingual dictionary induction. As semantic shifts are pervasive across languages and domains, FIPP first identifies the common geometric structure in both embeddings and then, only on the common structure, aligns the Gram matrices of these embeddings. Unlike previous approaches, FIPP is applicable even when the source and target embeddings are of differing dimensionalities. We show that our approach outperforms existing methods on the MUSE dataset for various language pairs. Furthermore, FIPP provides computational benefits both in ease of implementation and scalability.
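The abstract's two steps (filter to the common geometric structure, then align Gram matrices on it) can be sketched as follows. This is a hedged toy illustration, not the paper's exact solver: the sizes, tolerance `eps`, and gradient-descent alignment are all illustrative assumptions, and the Gram matrices are computed over a hypothetical seed dictionary of translation pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 50                               # seed-dictionary translation pairs (toy)
X = rng.standard_normal((n, 300))    # source embeddings (dim 300)
Y = rng.standard_normal((n, 200))    # target embeddings (dim 200; dims may differ)
X /= np.linalg.norm(X, axis=1, keepdims=True)
Y /= np.linalg.norm(Y, axis=1, keepdims=True)

Gx, Gy = X @ X.T, Y @ Y.T            # Gram matrices encode each space's geometry

# "Filter": keep only pairwise inner products on which the two geometries
# roughly agree -- the common structure the abstract refers to.
eps = 0.3                            # illustrative tolerance
mask = (np.abs(Gx - Gy) < eps).astype(float)

# "Align": gradient descent on || mask * (A A^T - Gy) ||_F^2 over new source
# embeddings A in the target dimensionality (one simple way to realize the
# alignment; not necessarily the paper's method).
A = 0.1 * rng.standard_normal((n, 200))
lr, losses = 0.01, []
for _ in range(200):
    resid = mask * (A @ A.T - Gy)    # masked Gram discrepancy
    losses.append(float(np.sum(resid ** 2)))
    A -= lr * 4.0 * resid @ A        # gradient of the masked Frobenius objective

print(losses[0], losses[-1])         # discrepancy shrinks as geometries align
```

Because the filter is applied before alignment, entries where the two embeddings genuinely disagree (semantic shifts) exert no pull on the projection.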

Vin Sachidananda, Ziyi Yang, Chenguang Zhu • 2020

Related benchmarks

Task                        | Dataset                          | Metric        | Result | Rank
Bilingual Lexicon Induction | PanLex-BLI HU-EU (test)          | P@1           | 11.58  | 7
Bilingual Lexicon Induction | PanLex-BLI EU-ET (test)          | P@1           | 8.22   | 7
Bilingual Lexicon Induction | Glavaš 1k translation pairs 2019 | Acc (DE→*)    | 37.7   | 6
Bilingual Lexicon Induction | PanLex-BLI BG-CA (test)          | P@1           | 34.29  | 6
Bilingual Lexicon Induction | PanLex-BLI CA-HE (test)          | P@1           | 20.63  | 6
Bilingual Lexicon Induction | PanLex-BLI HE-BG (test)          | P@1           | 26.38  | 6
Bilingual Lexicon Induction | PanLex-BLI ET-HU (test)          | P@1           | 30.3   | 6
Bilingual Lexicon Induction | Glavaš 5k translation pairs 2019 | Recall (DE→*) | 40.95  | 6
