Repetition Improves Language Model Embeddings

About

Bidirectional models are considered essential for strong text embeddings. Recent approaches that adapt autoregressive language models (LMs) into strong text embedding models have largely required modifying the LM architecture to be bidirectional. We challenge this premise by introducing "echo embeddings", which convert autoregressive LMs into high-quality text embedding models without changing the architecture or requiring fine-tuning. By repeating the input and extracting embeddings from the repeated tokens -- which have access to all original tokens -- echo embeddings improve on classical LM embeddings by more than 5% in zero-shot settings. Our zero-shot embeddings nearly match those obtained by bidirectionally-converted LMs that undergo additional masked-language modeling training. Echo embeddings are also compatible with supervised fine-tuning, matching or outperforming bidirectionally-converted LMs in an apples-to-apples comparison, even with an identical compute budget during training and inference. Overall, repetition is a simple and effective strategy to circumvent the need for bidirectional attention in embedding models, paving the way towards a unified architecture for all NLP tasks.
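
The repetition idea described in the abstract can be illustrated with a small, model-free sketch. It only shows the mechanics: the input is repeated, and pooling is restricted to the second copy, whose tokens can see the whole original input under causal attention. The helper names (`build_echo_input`, `sees_whole_input`, `mean_pool`), the use of mean pooling, and the toy hidden states are assumptions made for illustration; they are not taken from the authors' implementation, and a real setup would obtain hidden states from an autoregressive LM.

```python
from typing import List, Sequence, Tuple


def build_echo_input(tokens: Sequence[str]) -> Tuple[List[str], range]:
    """Repeat the token sequence; return it with the positions of the second copy."""
    n = len(tokens)
    echoed = list(tokens) + list(tokens)
    return echoed, range(n, 2 * n)


def sees_whole_input(position: int, n: int) -> bool:
    """Under causal attention a token attends to positions 0..position.

    It therefore sees the whole original input (positions 0..n-1)
    only if its own position is at least n - 1.
    """
    visible = range(position + 1)
    return all(i in visible for i in range(n))


def mean_pool(hidden_states: Sequence[Sequence[float]], positions: range) -> List[float]:
    """Average the per-token vectors at the given positions (assumed pooling choice)."""
    dim = len(hidden_states[0])
    count = len(positions)
    return [sum(hidden_states[p][d] for p in positions) / count for d in range(dim)]


tokens = ["the", "cat", "sat"]
echoed, second_copy = build_echo_input(tokens)

# The first token of the first copy cannot see later tokens of the input ...
first_token_sees_all = sees_whole_input(0, len(tokens))
# ... but every token of the second copy sees the full original input.
second_copy_sees_all = all(sees_whole_input(p, len(tokens)) for p in second_copy)

# Placeholder hidden states (one 2-d vector per echoed token); a real model
# would produce these from its final layer.
toy_hidden_states = [[float(i), 1.0] for i in range(len(echoed))]
embedding = mean_pool(toy_hidden_states, second_copy)
```

Pooling only over the second copy is the design point: each of those positions is conditioned on every token of the original input, which a single left-to-right pass over the unrepeated input cannot provide for early tokens.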

Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, Aditi Raghunathan • 2024

Related benchmarks

Task	Dataset	Result
Semantic Textual Similarity	STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R) various (test)	STS12 Score 59.36	412
Semantic Textual Similarity	STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R)	STS12 Score 52.4	253
Text Embedding	MTEB English v2	Mean Score 45.74	107
Semantic Textual Similarity	STS (Semantic Textual Similarity) 2012-2016 (test)	STS-12 Score 50.43	57
Text Embedding	MTEB	Classification Score 63.98	50
Triplet Alignment	Toxic	Accuracy 56.41	33
Clustering	NYTClust	V-Measure 39.9	33
Semantic Textual Similarity	PaperCode (PC)	STS 36.37	24
Triplet Alignment	AG-News	Triplet Alignment Accuracy (AG-News) 78.54	24
Triplet Alignment	IntentEmo	Triplet Alignment Score 28.38	24
