Zero-Shot Listwise Document Reranking with a Large Language Model
About
Supervised ranking methods based on bi-encoder or cross-encoder architectures have shown success in multi-stage text ranking tasks, but they require large amounts of relevance judgments as training data. In this work, we propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data. Unlike existing pointwise ranking methods, which score each document independently and rank documents by score, LRL directly generates a reordered list of document identifiers given the candidate documents. Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker, reordering only the top results of a pointwise method for improved efficiency. Additionally, we apply our approach to subsets of MIRACL, a recent multilingual retrieval dataset, with results showing its potential to generalize across different languages.
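The listwise approach described above can be sketched as two steps: build a prompt that presents the query and identifier-tagged candidates, then parse the model's generated identifier ordering back into a ranking. The sketch below is illustrative, not the paper's exact prompt; the function names and prompt wording are assumptions, and the LLM call itself is stubbed out.

```python
# Hedged sketch of LRL-style listwise reranking. The prompt text and helper
# names are illustrative assumptions; a real system would send the prompt to
# an LLM and feed its text reply to parse_ranking().
import re


def build_listwise_prompt(query, passages):
    """Assemble a prompt asking the model to reorder passage identifiers."""
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        f"The following passages are related to the query: {query}\n"
        f"{numbered}\n"
        "Rank the passages above by relevance to the query, most relevant "
        "first, as a list of identifiers like [2] > [1] > [3]."
    )


def parse_ranking(model_output, num_passages):
    """Extract the identifier ordering from the model's text output."""
    ids = [int(m) for m in re.findall(r"\[(\d+)\]", model_output)]
    # Keep only valid, first-occurrence identifiers, then append any the
    # model omitted so the result is always a full permutation.
    seen, order = set(), []
    for i in ids:
        if 1 <= i <= num_passages and i not in seen:
            seen.add(i)
            order.append(i)
    order += [i for i in range(1, num_passages + 1) if i not in seen]
    return order


# Example: pretend the model replied with this ordering string.
print(parse_ranking("[2] > [3] > [1]", 3))  # → [2, 3, 1]
```

Parsing defensively matters here: because the ranking is generated as free text, the model may emit duplicate or out-of-range identifiers, or drop some candidates entirely, so the parser normalizes its reply into a full permutation.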
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Document Ranking | TREC DL Track 2019 (test) | nDCG@10: 75.6 | 96 |
| Reranking | TREC 2020 (test) | nDCG@10: 70.6 | 55 |
| Nugget Coverage Reranking | NeuCLIR ReportGen 2024 (test) | nDCG: 92.3 | 18 |
| Nugget Coverage Reranking | CRUX-MDS DUC 2004 (test) | nDCG: 81.7 | 18 |
| Passage Reranking | BEIR (test) | Covid: 76.7 | 11 |