Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval

About

Contrastive learning has been successfully used for retrieval of semantically aligned sentences, but it often requires large batch sizes or careful engineering to work well. In this paper, we instead propose a generative model for learning multilingual text embeddings which can be used to retrieve or score sentence pairs. Our model operates on parallel data in $N$ languages and, through an approximation we introduce, efficiently encourages source separation in this multilingual setting, separating semantic information that is shared between translations from stylistic or language-specific variation. We show careful large-scale comparisons between contrastive and generation-based approaches for learning multilingual text embeddings, a comparison that has not been done to the best of our knowledge despite the popularity of these approaches. We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval -- the last of which we introduce in this paper. Overall, our Variational Multilingual Source-Separation Transformer (VMSST) model outperforms both a strong contrastive and generative baseline on these tasks.
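As a rough illustration of how sentence embeddings can be used to retrieve or score sentence pairs, the sketch below ranks candidate translations by cosine similarity and computes retrieval accuracy. It is not the VMSST model: the vectors are hand-written toy placeholders standing in for encoder output, and the function names (`cosine`, `retrieve`, `accuracy`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_vecs, candidate_vecs):
    """For each query vector, return (index of most similar candidate, similarity)."""
    results = []
    for q in query_vecs:
        scores = [cosine(q, c) for c in candidate_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results


def accuracy(results, gold):
    """Fraction of queries whose top-ranked candidate is the gold index."""
    hits = sum(1 for (idx, _), g in zip(results, gold) if idx == g)
    return hits / len(gold)


# Toy placeholder embeddings (assumption): three source sentences and their
# shuffled translations. A real system would obtain these from a trained encoder.
source_vecs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
target_vecs = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.8]]
gold = [1, 0, 2]  # index of the correct translation for each source sentence

matches = retrieve(source_vecs, target_vecs)
print([idx for idx, _ in matches], accuracy(matches, gold))
```

Exhaustive pairwise scoring like this is quadratic in the number of sentences, so a large-scale mining setup would typically swap in an approximate nearest-neighbor index; the scoring function itself stays the same.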

John Wieting, Jonathan H. Clark, William W. Cohen, Graham Neubig, Taylor Berg-Kirkpatrick • 2022

Related benchmarks

Task	Dataset	Metric	Result	
Semantic Similarity	Semantic Similarity Cross-lingual XL	Pearson Correlation Coefficient	0.791	24
Multi-task Evaluation	Aggregate All tasks (summary)	Score	65.9	20
Bitext Mining	Tatoeba (full)	Accuracy	85.4	12
Bitext Mining	BUCC (full)	F1 (Cosine Similarity)	87.8	12
Question Retrieval	MKQA (full)	Retrieval Accuracy	29.9	12
Semantic Similarity	Semantic Similarity English-only	Pearson's r	74.6	12
Semantic Similarity	Semantic Similarity Cross-lingual same language XL s.	Pearson's r	0.815	12
Question Retrieval	NQ (Natural Questions) (full)	Retrieval Accuracy	40.8	12
Cross-lingual Semantic Similarity	XL (test)	Spearman's rho	79.4	12
Cross-lingual Semantic Similarity	XL s. (test)	Spearman's rho	81.9	6

Showing 10 of 12 rows
