Dense Passage Retrieval for Open-Domain Question Answering

About

Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
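To make the dual-encoder idea and the top-k retrieval accuracy metric concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not name the similarity function, so scoring by inner product is an assumption. The hand-written two-dimensional vectors stand in for learned encoder outputs. The function names, and the definition of a "hit" as at least one gold passage appearing in the top k, are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors (assumed similarity function)."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(question_vec: Sequence[float],
                   passage_vecs: List[Sequence[float]],
                   k: int) -> List[int]:
    """Return the indices of the k passages with the highest similarity score."""
    scores = [dot(question_vec, p) for p in passage_vecs]
    ranked = sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(question_vecs: List[Sequence[float]],
                   passage_vecs: List[Sequence[float]],
                   gold: List[Set[int]],
                   k: int) -> float:
    """Fraction of questions with at least one gold passage among the top k."""
    hits = 0
    for q, gold_ids in zip(question_vecs, gold):
        if any(i in gold_ids for i in retrieve_top_k(q, passage_vecs, k)):
            hits += 1
    return hits / len(question_vecs)


# Toy stand-ins for encoder outputs; a real system would produce these
# vectors with separately trained question and passage encoders.
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
questions = [[0.9, 0.1], [0.2, 0.8]]
gold = [{0}, {2}]  # hypothetical gold passage indices per question

print(retrieve_top_k(questions[0], passages, 2))
print(top_k_accuracy(questions, passages, gold, 1))
print(top_k_accuracy(questions, passages, gold, 2))
```

In this toy setup the second question ranks its gold passage second, so accuracy rises when k goes from 1 to 2. That is the same effect the top-20 and top-100 accuracy figures measure at larger scale.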

Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih • 2020

Related benchmarks

Task | Dataset | Metric | Result | Rank
Multi-hop Question Answering | HotpotQA | F1 Score | 44.69 | 221
Open Question Answering | Natural Questions (NQ) (test) | Exact Match (EM) | 44.6 | 134
Document Ranking | TREC DL Track 2019 (test) | nDCG@10 | 62.2 | 96
Retrieval | MS MARCO (dev) | MRR@10 | 0.311 | 84
Open-domain Question Answering | TriviaQA (test) | Exact Match | 60.9 | 80
Question Answering | HotpotQA | EM | 23.13 | 79
Information Retrieval | BEIR (test) | TREC-COVID Score | 33.2 | 76
Question Answering | Natural Questions (test) | EM | 41.5 | 72
Retrieval | TREC DL 2019 | NDCG@10 | 65.3 | 71
Passage retrieval | TriviaQA (test) | Top-100 Acc | 85 | 67
Showing 10 of 185 rows
