Improving Passage Retrieval with Zero-Shot Question Generation

About

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.
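The scoring idea in the abstract can be illustrated with a minimal sketch: each retrieved passage is scored by the log-probability of the question given that passage, and passages are sorted by that score. Everything named here is an assumption for illustration. `toy_token_logprob` is a smoothed word-count model standing in for the pre-trained language model, `question_log_likelihood` and `rerank` are hypothetical helper names, and averaging over question tokens is a choice made in this sketch rather than a detail stated above.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: log p(token | passage, previous question tokens).
TokenLogProb = Callable[[str, Sequence[str], str], float]


def toy_token_logprob(passage: str, prefix: Sequence[str], token: str) -> float:
    """Stand-in for a pre-trained LM (assumption, not the paper's model).

    Add-one smoothed unigram model over the passage words. It ignores
    `prefix`; a real language model would condition on it as well.
    """
    words = passage.lower().split()
    counts = Counter(words)
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens
    return math.log((counts[token] + 1) / (len(words) + vocab_size))


def question_log_likelihood(
    question: str, passage: str, token_logprob: TokenLogProb
) -> float:
    """Mean log-probability of the question tokens given the passage."""
    tokens = question.lower().split()
    total = 0.0
    for i, tok in enumerate(tokens):
        # Every question token contributes to the score.
        total += token_logprob(passage, tokens[:i], tok)
    return total / max(len(tokens), 1)


def rerank(
    question: str,
    passages: List[str],
    token_logprob: TokenLogProb = toy_token_logprob,
) -> List[Tuple[float, str]]:
    """Return (score, passage) pairs sorted from most to least likely."""
    scored = [
        (question_log_likelihood(question, p, token_logprob), p)
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    candidates = [
        "the cat sat on the mat",
        "paris is the capital of france",
    ]
    for score, passage in rerank("capital of france", candidates):
        print(f"{score:.3f}  {passage}")
```

Because the scorer only needs a question and a list of passages, it can sit on top of any first-stage retriever. Swapping the toy model for a real pre-trained language model would mean replacing `toy_token_logprob` with a function that returns that model's token log-probabilities.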

Devendra Singh Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer • 2022

Related benchmarks

Task	Dataset	Result
Question Answering	2Wiki	EM 9.2	241
Ranking	BEIR selected subset v1.0.0 (test)	TREC-COVID 69.25	38
Reranking	BEIR	NQ NDCG@5 0.3486	35
Reranking	TREC	NDCG@5 (DL19) 65.77	35
Passage Ranking	NQ	MRR 29.53	29
Passage Ranking	WebQuestions (WQ)	R@10 54.8	28
Passage retrieval	Natural Questions (NQ)	Top-10 Accuracy 53.51	28
Passage Ranking	TREC DL 2019	R@10 83.33	28
Passage Ranking	TREC DL 2020	R@10 77.27	28
Document Reranking	BEIR	NDCG@10 (Covid) 68.11	24

