Dense Passage Retrieval for Open-Domain Question Answering

About

Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
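As a rough illustration of the idea in the abstract, the sketch below ranks passages by the similarity between a question vector and passage vectors, then computes top-k retrieval accuracy (the fraction of questions whose top-k passages include a relevant one). It is a minimal sketch under stated assumptions, not the authors' implementation: the hand-written vectors stand in for the outputs of learned question and passage encoders, inner-product similarity is assumed as the scoring function, and the names `top_k`, `top_k_accuracy`, `passages`, `questions`, and `gold` are hypothetical.

```python
from typing import List, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(question_vec: Sequence[float],
          passage_vecs: Sequence[Sequence[float]],
          k: int) -> List[int]:
    """Return the indices of the k passages with the highest inner-product score."""
    scores = [dot(question_vec, p) for p in passage_vecs]
    ranked = sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(question_vecs: Sequence[Sequence[float]],
                   passage_vecs: Sequence[Sequence[float]],
                   gold: Sequence[Set[int]],
                   k: int) -> float:
    """Fraction of questions with at least one gold passage among the top k."""
    hits = sum(
        1
        for q, relevant in zip(question_vecs, gold)
        if relevant & set(top_k(q, passage_vecs, k))
    )
    return hits / len(question_vecs)


# Toy vectors (assumption): stand-ins for the outputs of the two encoders.
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
questions = [[0.9, 0.1], [0.1, 0.9]]
gold = [{0}, {2}]  # indices of the relevant passages for each question

print(top_k_accuracy(questions, passages, gold, k=1))  # 0.5
print(top_k_accuracy(questions, passages, gold, k=2))  # 1.0
```

With real encoders, the passage vectors would typically be computed once and indexed, so that only the question needs to be encoded at query time.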

Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-hop Question Answering | HotpotQA | F1 Score 44.69 | 294 |
| Open Question Answering | Natural Questions (NQ) (test) | Exact Match (EM) 44.6 | 134 |
| Document Ranking | TREC DL Track 2019 (test) | nDCG@10 62.2 | 133 |
| Question Answering | HotpotQA | EM 23.13 | 109 |
| Information Retrieval | BEIR (test) | FiQA-2018 Score 34.2 | 90 |
| Question Answering | NQ (test) | EM Accuracy 36.09 | 86 |
| Retrieval | MS MARCO (dev) | MRR@10 0.311 | 84 |
| Retrieval | TREC DL 2019 | NDCG@10 65.3 | 83 |
| Open-domain Question Answering | TriviaQA (test) | Exact Match 60.9 | 80 |
| Open-domain Question Answering | Natural Questions (NQ) | Exact Match (EM) 41.5 | 74 |
Showing 10 of 239 rows
