A Multi-Task Embedder For Retrieval Augmented LLMs

About

LLMs face inherent limitations in their knowledge, memory, and action. Retrieval augmentation is a vital mechanism for addressing these limitations: it brings in useful information from external sources to augment the LLM. However, existing retrieval methods encounter two pressing issues. On one hand, general retrievers are not properly optimized for retrieval augmentation and hence exhibit limited effectiveness; on the other hand, task-specific retrievers excel in their targeted retrieval augmentation scenario but lack the versatility to handle diverse scenarios. In this work, we propose LLM-Embedder for the unified support of diverse retrieval augmentation scenarios. Our method presents three technical contributions. First, we introduce a new reward formulation, namely rank-aware reward. It exploits the ranking position of the desired output among N sampled outputs from the LLM, which leads to fine-grained and robust computation of reward from the LLM's feedback. Second, we design a novel distillation objective, called graded distillation. It incorporates both the absolute value and the relative order of the reward to make fuller use of the LLM's feedback. Third, we systematically optimize multi-task learning, which effectively unifies multiple retrieval functionalities into one model. In our experiments, LLM-Embedder notably improves the LLM's performance on various downstream tasks and outperforms both general and task-specific retrievers by a substantial margin.

Peitian Zhang, Shitao Xiao, Zheng Liu, Zhicheng Dou, Jian-Yun Nie • 2023
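
The abstract describes the rank-aware reward only at a high level: the reward for a retrieval candidate depends on where the desired output ranks among N outputs sampled from the LLM. A minimal sketch of that idea follows. The function name, the use of per-output LLM scores (for example log-likelihoods), and the reciprocal-rank mapping are all illustrative assumptions, not the paper's actual formulation.

```python
from typing import Sequence


def rank_aware_reward(desired_score: float, sampled_scores: Sequence[float]) -> float:
    """Hypothetical rank-based reward in (0, 1].

    desired_score: the LLM's score for the desired output, given one
        retrieval candidate in the context (assumed: higher is better).
    sampled_scores: the LLM's scores for the N sampled outputs under
        the same context.
    """
    # Rank of the desired output: 1 plus the number of sampled outputs
    # that the LLM scores strictly higher.
    rank = 1 + sum(1 for s in sampled_scores if s > desired_score)
    # Assumption: map the rank to a reward by taking its reciprocal.
    return 1.0 / rank


# A candidate under which the desired output outranks every sample gets
# the maximum reward; one sample ranked above it halves the reward.
best = rank_aware_reward(-0.2, [-1.0, -2.5, -0.9])
second = rank_aware_reward(-1.0, [-2.0, -0.5, -3.0])
```

Because only the rank enters the reward, it does not change when all of the LLM's scores are shifted by the same amount, which is one plausible reading of the robustness the abstract claims.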

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Multi-hop Question Answering	2WikiMultihopQA	EM	45.72	559
Multi-hop Question Answering	MuSiQue	EM	18.36	209
Multi-hop Question Answering	HotpotQA	Exact Match (EM)	41.39	150
Multi-hop Question Answering	Bamboogle	Exact Match	40.8	128
General Question Answering	TriviaQA	Exact Match	62.33	54
General Question Answering	NQ	Exact Match (EM)	41.32	52
General Question Answering	PopQA	EM	42.69	51
Conversational Search	CAsT 19	MRR	63.3	24
Conversational Search	CAsT 20	MRR	25.2	24
Question Answering	Combined 7 Datasets	Average Score	39.82	18

Showing 10 of 11 rows
