
D2LLM: Decomposed and Distilled Large Language Models for Semantic Search

About

The key challenge in semantic search is to create models that are both accurate and efficient in pinpointing relevant sentences for queries. While BERT-style bi-encoders excel in efficiency with pre-computed embeddings, they often miss subtle nuances in search tasks. Conversely, GPT-style LLMs with cross-encoder designs capture these nuances but are computationally intensive, hindering real-time applications. In this paper, we present D2LLM (Decomposed and Distilled LLMs for semantic search), which combines the best of both worlds. We decompose a cross-encoder into an efficient bi-encoder integrated with Pooling by Multihead Attention and an Interaction Emulation Module, achieving both nuanced understanding and pre-computability. Knowledge from the LLM is distilled into this model using contrastive, rank, and feature imitation techniques. Our experiments show that D2LLM surpasses five leading baselines on all metrics across three tasks, improving NLI task performance by at least 6.45%. The source code is available at https://github.com/codefuse-ai/D2LLM.
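To make the decomposition concrete, the sketch below illustrates Pooling by Multihead Attention: a learnable query vector attends over a sequence of token embeddings and pools them into a single sentence embedding that can be pre-computed per sentence. This is a minimal NumPy illustration of the general technique, not the paper's implementation; the function name, weight shapes, and single-query setup are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pma_pool(H, q, Wk, Wv, num_heads):
    """Pool token embeddings H (T, d) into one d-dim vector via
    multihead attention from a learnable query q (d,).

    Hypothetical shapes/names for illustration: Wk and Wv are (d, d)
    key/value projections; d must be divisible by num_heads.
    """
    T, d = H.shape
    dh = d // num_heads
    K = H @ Wk  # keys,   (T, d)
    V = H @ Wv  # values, (T, d)
    pooled = []
    for h in range(num_heads):
        sl = slice(h * dh, (h + 1) * dh)
        # Scaled dot-product attention for this head: (T,) scores.
        scores = (q[sl] @ K[:, sl].T) / np.sqrt(dh)
        attn = softmax(scores)
        pooled.append(attn @ V[:, sl])  # weighted sum of values, (dh,)
    return np.concatenate(pooled)  # concatenated heads, (d,)
```

Because the pooled embedding depends only on one sentence, a corpus can be encoded offline, while the (separate) Interaction Emulation Module approximates the query-passage interactions that a full cross-encoder would compute at query time.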

Zihan Liao, Hang Yu, Jianguo Li, Jun Wang, Wei Zhang • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Textual Similarity | STS-B | Spearman's Rho (x100) | 77.29 | 70 |
| Semantic Textual Similarity | LCQMC | Pearson Correlation | 0.6233 | 11 |
| Semantic Textual Similarity | PAWSX | Pearson Correlation | 0.3038 | 11 |
| Semantic Textual Similarity | ATEC | Pearson Correlation | 0.4603 | 11 |
| Semantic Textual Similarity | BQ | Pearson Correlation | 0.5589 | 11 |
| Semantic Textual Similarity | AFQMC | Pearson Correlation | 0.3891 | 11 |
| Semantic Textual Similarity | QBQTC | Pearson Correlation | 0.2756 | 11 |
| NLI | OCNLI (test) | Accuracy | 0.7889 | 9 |
| NLI | CMNLI (test) | Accuracy | 80.14 | 9 |
| Information Retrieval | T2Retrieval | MRR | 0.8893 | 8 |

(Showing 10 of 15 rows.)

Other info

Code: https://github.com/codefuse-ai/D2LLM
