
Repetition Improves Language Model Embeddings

About

Bidirectional models are considered essential for strong text embeddings. Recent approaches to adapting autoregressive language models (LMs) into strong text embedding models have largely required modifying the LM architecture to be bidirectional. We challenge this premise by introducing "echo embeddings", which convert autoregressive LMs into high-quality text embedding models without changing the architecture or requiring fine-tuning. By repeating the input and extracting embeddings from the repeated tokens -- which have access to all original tokens -- echo embeddings improve over classical LM embeddings by over 5% in zero-shot settings. Our zero-shot embeddings nearly match those obtained by bidirectionally-converted LMs that undergo additional masked-language modeling training. Echo embeddings are also compatible with supervised fine-tuning, matching or outperforming bidirectionally-converted LMs in an apples-to-apples comparison, even with an identical compute budget during training and inference. Overall, repetition is a simple and effective strategy to circumvent the need for bidirectional attention in embedding models, paving the way towards a unified architecture for all NLP tasks.

Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, Aditi Raghunathan• 2024
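
The mechanism described in the abstract (repeat the input, then pool only over the repeated tokens) can be illustrated with a small toy sketch. It is not the authors' code and calls no real model: the instruction tokens, the helper names (`build_echo_tokens`, `causal_context`, `mean_pool`), and the choice of mean pooling are assumptions made here for illustration. It only shows why, under a causal mask, every token in the second copy can attend to the whole first copy.

```python
from typing import List, Sequence, Tuple


def build_echo_tokens(tokens: Sequence[str]) -> Tuple[List[str], range]:
    """Repeat the input once and return the positions of the second copy."""
    prefix = ["Rewrite", ":"]      # hypothetical instruction tokens
    bridge = ["Rewritten", ":"]    # hypothetical separator before the echo
    echoed = prefix + list(tokens) + bridge + list(tokens)
    start = len(prefix) + len(tokens) + len(bridge)
    return echoed, range(start, start + len(tokens))


def causal_context(position: int) -> range:
    """Positions visible to a token under a causal mask: itself and earlier."""
    return range(position + 1)


def mean_pool(hidden_states: Sequence[Sequence[float]], positions: range) -> List[float]:
    """Average the hidden-state vectors found at the given positions."""
    dim = len(hidden_states[0])
    return [
        sum(hidden_states[p][d] for p in positions) / len(positions)
        for d in range(dim)
    ]


tokens = ["the", "cat", "sat"]
echoed, second_copy = build_echo_tokens(tokens)
first_copy = range(2, 2 + len(tokens))  # positions right after the prefix

# Every token in the second copy sees the entire first copy.
sees_all = all(q in causal_context(p) for p in second_copy for q in first_copy)

# The first token of the first copy cannot see the last token of that copy.
first_is_blind = (first_copy[-1] not in causal_context(first_copy[0]))
```

In a real setting, `hidden_states` would come from the last layer of an autoregressive LM run on the echoed prompt, and `mean_pool(hidden_states, second_copy)` would give the sentence embedding under the pooling assumption made here.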

Related benchmarks

Task | Dataset | Result | Rank
Semantic Textual Similarity | STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R) various (test) | STS12 Score: 59.36 | 412
Semantic Textual Similarity | STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R) | STS12 Score: 52.4 | 253
Text Embedding | MTEB English v2 | Mean Score: 45.74 | 107
Semantic Textual Similarity | STS (Semantic Textual Similarity) 2012-2016 (test) | STS-12 Score: 50.43 | 57
Text Embedding | MTEB | Classification Score: 63.98 | 50
Triplet Alignment | Toxic | Accuracy: 56.41 | 33
Clustering | NYTClust | V-Measure: 39.9 | 33
Semantic Textual Similarity | PaperCode (PC) | STS: 36.37 | 24
Triplet Alignment | AG-News | Triplet Alignment Accuracy (AG-News): 78.54 | 24
Triplet Alignment | IntentEmo | Triplet Alignment Score: 28.38 | 24
Showing 10 of 20 rows
