PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting

About

Large Language Models (LLMs) have demonstrated remarkable efficacy in text embedding, yet current adaptation methods such as LoRA face significant bottlenecks in computational efficiency and cross-architecture transferability. Whenever a new backbone emerges, existing approaches require costly retraining from scratch. To address this, we propose PromptEmbedder, a novel dual-LLM framework that decouples embedding knowledge from specific backbone weights. PromptEmbedder uses a Prompting LLM to generate instruction-aware soft prompts for a frozen Embedding LLM via a differentiable generation process with continuous relaxation, ensuring full gradient flow during contrastive training. Because task-specific knowledge is localized within the Prompting LLM, adapting to a new architecture requires retraining only a lightweight linear alignment matrix. Evaluations on the MTEB benchmark show that PromptEmbedder achieves performance comparable to LoRA fine-tuning while reducing GPU memory usage by 40% and accelerating training by 3.7x. Our approach establishes a scalable, architecture-agnostic paradigm for efficient LLM-based representation learning.
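
The abstract does not spell out the relaxation, the alignment step, or the loss, so the following is only a minimal sketch of the general idea under stated assumptions: the continuous relaxation is taken to be a temperature softmax over the Prompting LLM's token logits (the expected token embedding replaces a discrete argmax), the alignment is a plain matrix-vector product, and the contrastive objective is an InfoNCE-style loss over cosine similarities. All function names, shapes, and values here (`soft_token`, `align`, `info_nce`, the toy tables) are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_token(logits, embedding_table, temperature=1.0):
    """Assumed relaxation: the expected embedding under softmax(logits).

    Unlike argmax or sampling, this weighted average is differentiable
    with respect to the logits, so gradients can reach the prompt generator.
    """
    probs = softmax(logits, temperature)
    dim = len(embedding_table[0])
    return [
        sum(p * row[d] for p, row in zip(probs, embedding_table))
        for d in range(dim)
    ]


def align(vec, matrix):
    """Linear alignment: matrix has shape (target_dim, source_dim)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(query, positive, negatives, tau=0.05):
    """InfoNCE-style loss: negative log-probability of the positive."""
    sims = [cosine(query, positive)] + [cosine(query, n) for n in negatives]
    return -math.log(softmax(sims, temperature=tau)[0])


# Toy example: 3 prompt tokens with 2-dim embeddings, mapped into a
# 3-dim input space of a (hypothetical) frozen embedding model.
toy_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_logits = [2.0, 0.5, -1.0]
toy_alignment = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

soft_prompt = align(soft_token(toy_logits, toy_table), toy_alignment)
```

In this reading, swapping the embedding backbone would mean replacing only `toy_alignment` with a matrix of the new target dimension, while the part that produces `toy_logits` stays fixed.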

Yu-Che Tsai, Kuan-Yu Chen, Yuan-Hao Chen, Yu-Han Chang, Ching-Yu Tsai, Yu-Hsiang Chuang, Shou-De Lin • 2026

Related benchmarks

Task	Dataset	Metric	Result
Text Embedding	MTEB English v2	Mean Score	65.23	113
Triplet Alignment	Toxic	Accuracy	59.58	33
Clustering	NYTClust	V-Measure	42.12	33
Triplet Alignment	IntentEmo	Triplet Alignment Score	65.01	24
Semantic Textual Similarity	Big Patent (BP)	STS Score	26.92	24
Triplet Alignment	AG-News	Triplet Alignment Accuracy (AG-News)	85.36	24
Semantic Textual Similarity	PaperCode (PC)	STS	38.44	24
Semantic Textual Similarity	MultiHate (MH)	STS Score	15.46	24
