Learning Passage Impacts for Inverted Indexes

About

Neural information retrieval systems typically use a cascading pipeline, in which a first-stage model retrieves a candidate set of documents and one or more subsequent stages re-rank this set using contextualized language models such as BERT. In this paper, we propose DeepImpact, a new document term-weighting scheme suitable for efficient retrieval using a standard inverted index. Compared to existing methods, DeepImpact improves impact-score modeling and tackles the vocabulary-mismatch problem. In particular, DeepImpact leverages DocT5Query to enrich the document collection and, using a contextualized language model, directly estimates the semantic importance of tokens in a document, producing a single-value representation for each token in each document. Our experiments show that DeepImpact significantly outperforms prior first-stage retrieval approaches, by up to 17% on effectiveness metrics relative to DocT5Query, and, when deployed in a re-ranking scenario, can reach the same effectiveness as state-of-the-art approaches with up to a 5.1x speedup.
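To make the idea of a single impact value per token per document concrete, here is a minimal sketch of impact-based retrieval over an inverted index. It assumes a document's score for a query is the sum of the stored impacts of the query terms that appear in it. The impact values, document IDs, and the helper names `build_index` and `search` are hypothetical and chosen for illustration. In DeepImpact the impacts would come from the trained model applied to the DocT5Query-expanded documents, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict


def build_index(doc_impacts):
    """Build an inverted index: term -> list of (doc_id, impact) postings."""
    index = defaultdict(list)
    for doc_id, impacts in doc_impacts.items():
        for term, impact in impacts.items():
            index[term].append((doc_id, impact))
    return index


def search(index, query_terms, k=10):
    """Score documents by summing the impacts of matching query terms."""
    scores = defaultdict(float)
    for term in set(query_terms):  # count each distinct query term once
        for doc_id, impact in index.get(term, []):
            scores[doc_id] += impact
    # Highest score first; ties broken by doc_id for a stable ordering.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical per-document term impacts (made-up values, not model output).
doc_impacts = {
    "d1": {"inverted": 12, "index": 9, "retrieval": 4},
    "d2": {"neural": 10, "retrieval": 11, "ranking": 6},
    "d3": {"index": 3, "compression": 14},
}

index = build_index(doc_impacts)
results = search(index, ["index", "retrieval"])
print(results)
```

Because each posting carries one precomputed number, query processing needs no neural inference: it only looks up postings and adds values, which is why the scheme fits a standard inverted index.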

Antonio Mallia, Omar Khattab, Nicola Tonellotto, Torsten Suel • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Passage retrieval | MsMARCO (dev) | MRR@10 | 32.6 | 116 |
| Passage Ranking | MS MARCO (dev) | MRR@10 | 32.8 | 73 |
| Retrieval | TREC DL 2019 | NDCG@10 | 69.5 | 71 |
| Passage Ranking | TREC DL 2019 (test) | NDCG@10 | 69.5 | 33 |
| Passage Ranking | TREC DL 2020 (test) | NDCG@10 | 0.628 | 15 |
| Retrieval | MS-MARCO v1 (test) | L_AMD | 24.5 | 7 |
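For readers unfamiliar with the MRR@10 and NDCG@10 metrics reported above, the following sketch shows how each could be computed for a single query. It is not the evaluation code behind these benchmark entries. The function names are hypothetical, and the NDCG variant uses linear gain with a log2 discount, while some tools use an exponential gain instead. Reported figures are averages over all queries, and the rows above mix percentage and fractional scales.

```python
import math


def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_ids, graded_rels, k=10):
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [graded_rels.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(graded_rels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical ranking and relevance judgments for one query.
ranked = ["d2", "d1", "d3"]
mrr = mrr_at_k(ranked, {"d1"})              # first relevant hit is at rank 2
ndcg = ndcg_at_k(ranked, {"d1": 3, "d2": 1})
print(f"MRR@10 = {mrr:.3f}, NDCG@10 = {ndcg:.3f}")
```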