
RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models

About

Researchers have successfully applied large language models (LLMs) such as ChatGPT to reranking in an information retrieval context, but to date, such work has mostly been built on proprietary models hidden behind opaque API endpoints. This approach yields experimental results that are not reproducible and non-deterministic, threatening the veracity of outcomes that build on such shaky foundations. To address this significant shortcoming, we present RankVicuna, the first fully open-source LLM capable of performing high-quality listwise reranking in a zero-shot setting. Experimental results on the TREC 2019 and 2020 Deep Learning Tracks show that we can achieve effectiveness comparable to zero-shot reranking with GPT-3.5 with a much smaller 7B parameter model, although our effectiveness remains slightly behind reranking with GPT-4. We hope our work provides the foundation for future research on reranking with modern LLMs. All the code necessary to reproduce our results is available at https://github.com/castorini/rank_llm.
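To make the abstract's "listwise reranking in a zero-shot setting" concrete, here is a minimal sketch of the general prompt-then-parse pattern used by RankGPT-style listwise rerankers: candidate passages are numbered in a single prompt, the LLM answers with an ordered list of identifiers, and that permutation is parsed back into a reordering. All function names and the prompt wording are illustrative assumptions, not the actual API of the rank_llm repository.

```python
import re

def build_listwise_prompt(query, passages):
    """Number candidate passages and ask the model for a ranked identifier list.
    (Illustrative prompt wording, not RankVicuna's actual template.)"""
    lines = [f"[{i + 1}] {p}" for i, p in enumerate(passages)]
    return (
        f"Rank the following {len(passages)} passages by relevance to the "
        f"query: {query}\n" + "\n".join(lines) +
        "\nAnswer with identifiers only, e.g. [2] > [1] > [3]."
    )

def parse_permutation(response, num_passages):
    """Extract the 0-based permutation from the model's response,
    dropping duplicate or out-of-range identifiers and appending any
    identifiers the model omitted (a common robustness step)."""
    seen = []
    for match in re.findall(r"\[(\d+)\]", response):
        idx = int(match) - 1
        if 0 <= idx < num_passages and idx not in seen:
            seen.append(idx)
    seen += [i for i in range(num_passages) if i not in seen]
    return seen

def rerank(query, passages, generate):
    """`generate` stands in for any LLM completion function
    (e.g. a call to a locally hosted Vicuna model)."""
    response = generate(build_listwise_prompt(query, passages))
    order = parse_permutation(response, len(passages))
    return [passages[i] for i in order]
```

Because the model only emits a permutation of identifiers rather than scores, the reranker is deterministic given a deterministic decoding setup, which is precisely the reproducibility property the paper argues for.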

Ronak Pradeep, Sahel Sharifymoghaddam, Jimmy Lin • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Document Ranking | TREC DL Track 2019 (test) | nDCG@10 | 68.9 | 96 |
| Reranking | TREC 2020 (test) | nDCG@10 | 66.1 | 55 |
| Question Answering | Scientific QA Base setting | F1 Score | 45.27 | 38 |
| Information Retrieval | Scientific QA Base setting | Hit Rate@1 | 52.53 | 38 |
| Ranking | BEIR selected subset v1.0.0 (test) | TREC-COVID | 80.5 | 38 |
| End-to-end Question Answering | 2WikiMultiHopQA (test val) | EM | 32.66 | 20 |
| End-to-end Question Answering | HotpotQA (test val) | EM | 32.08 | 20 |
| End-to-end Question Answering | MuSiQue (test val) | EM | 7.78 | 20 |
| End-to-end Question Answering | MultiHopRAG (test val) | Accuracy | 42.76 | 20 |
| Reranking | SciRAG-SSLI hard 1.0 (test) | Hit Rate@1 | 58.67 | 19 |

Showing 10 of 15 rows
