
Fine-Tuning LLaMA for Multi-Stage Text Retrieval

About

The effectiveness of multi-stage text retrieval has been solidly demonstrated since before the era of pre-trained language models. However, most existing studies utilize models that predate recent advances in large language models (LLMs). This study seeks to explore potential improvements that state-of-the-art LLMs can bring. We conduct a comprehensive study, fine-tuning the latest LLaMA model both as a dense retriever (RepLLaMA) and as a pointwise reranker (RankLLaMA) for both passage retrieval and document retrieval using the MS MARCO datasets. Our findings demonstrate that the effectiveness of large language models indeed surpasses that of smaller models. Additionally, since LLMs can inherently handle longer contexts, they can represent entire documents holistically, obviating the need for traditional segmenting and pooling strategies. Furthermore, evaluations on BEIR demonstrate that our RepLLaMA-RankLLaMA pipeline exhibits strong zero-shot effectiveness. Model checkpoints from this study are available on HuggingFace.
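The RepLLaMA-then-RankLLaMA pipeline described above has a simple two-stage shape: a dense bi-encoder retrieves top-k candidates by embedding similarity, then a pointwise reranker independently scores each (query, passage) pair and re-sorts the candidates. The sketch below illustrates only that pipeline shape, not the actual models: the `embed` function is a deterministic toy stand-in for the fine-tuned LLaMA encoders, and the function names are illustrative, not from the paper's codebase.

```python
import zlib
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Toy deterministic embedding (stand-in for a fine-tuned LLaMA encoder)."""
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def dense_retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Stage 1 (RepLLaMA-style): score every passage against the query
    embedding with a dot product and keep the top-k candidates."""
    q = embed(query)
    scored = sorted(((float(q @ embed(p)), p) for p in corpus),
                    key=lambda s: -s[0])
    return [p for _, p in scored[:k]]

def pointwise_rerank(query: str, candidates: list[str]) -> list[str]:
    """Stage 2 (RankLLaMA-style): score each (query, passage) pair
    independently, then re-sort the candidate list by that score.
    A real pointwise reranker feeds the concatenated pair through the
    model; the toy scorer here just embeds the concatenation."""
    scored = sorted(((float(embed(query) @ embed(query + " " + p)), p)
                     for p in candidates),
                    key=lambda s: -s[0])
    return [p for _, p in scored]

if __name__ == "__main__":
    corpus = [
        "passage about dense retrieval",
        "passage about cats",
        "passage about pointwise rerankers",
        "passage about cooking",
    ]
    top_k = dense_retrieve("how do rerankers work", corpus, k=3)
    ranked = pointwise_rerank("how do rerankers work", top_k)
    print(ranked)
```

The design point the abstract makes is that, because LLaMA handles long contexts natively, stage 1 can embed a whole document as a single vector rather than segmenting it into passages and pooling passage scores.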

Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-hop Question Answering | 2WikiMultihopQA | EM | 32.24 | 278 |
| Multi-hop Question Answering | HotpotQA | F1 | 32.95 | 221 |
| Question Answering | 2Wiki | F1 | 24.1 | 75 |
| Multi-hop Question Answering | Multi-hop RAG | -- | -- | 65 |
| Information Retrieval | BEIR | TREC-COVID | 0.847 | 59 |
| Information Retrieval | BEIR v1.0.0 (test) | ArguAna | 48.6 | 55 |
| Conversational Retrieval | QReCC (test) | Recall@10 | 20.4 | 43 |
| Conversational Retrieval | TopiOCQA (test) | NDCG@3 | 0.15 | 26 |
| Document Retrieval | MS MARCO Document (dev) | MRR@100 | 0.456 | 24 |
| Passage Ranking | TREC DL 2019 | NDCG@10 | 0.743 | 24 |

Showing 10 of 41 rows.
