Latent Retrieval for Weakly Supervised Open Domain Question Answering

About

Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a blackbox information retrieval (IR) system to retrieve evidence candidates. We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from IR. We show for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs and without any IR system. In this setting, evidence retrieval from all of Wikipedia is treated as a latent variable. Since this is impractical to learn from scratch, we pre-train the retriever with an Inverse Cloze Task. We evaluate on open versions of five QA datasets. On datasets where the questioner already knows the answer, a traditional IR system such as BM25 is sufficient. On datasets where a user is genuinely seeking an answer, we show that learned retrieval is crucial, outperforming BM25 by up to 19 points in exact match.
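The abstract names the Inverse Cloze Task without describing how its training pairs are built. The following is a minimal sketch, assuming the task treats one sentence of a passage as a pseudo-question and the remaining sentences as pseudo-evidence. The function name `make_ict_example`, the `keep_prob` parameter, and its default value are illustrative assumptions, not details taken from this page.

```python
import random


def make_ict_example(sentences, rng, keep_prob=0.1):
    """Build one (pseudo_question, pseudo_evidence) pair from a passage.

    sentences: list of sentence strings forming one passage.
    rng:       a random.Random instance, passed in for reproducibility.
    keep_prob: assumed probability of leaving the chosen sentence inside
               the evidence (hypothetical default).
    """
    if len(sentences) < 2:
        raise ValueError("need at least two sentences to form a pair")

    # Pick the sentence that plays the role of the question.
    idx = rng.randrange(len(sentences))
    question = sentences[idx]

    # Usually remove that sentence from the context, so a retriever
    # cannot succeed through exact string overlap alone.
    if rng.random() < keep_prob:
        context = list(sentences)
    else:
        context = sentences[:idx] + sentences[idx + 1:]

    return question, " ".join(context)


passage = [
    "Zebras have four gaits: walk, trot, canter and gallop.",
    "They are generally slower than horses.",
    "Their great stamina helps them outrun predators.",
]
question, evidence = make_ict_example(passage, random.Random(0))
```

A retriever pre-trained on such pairs would be asked to pick the matching evidence for each pseudo-question from among the other passages in a batch. That training loop is outside the scope of this sketch.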

Kenton Lee, Ming-Wei Chang, Kristina Toutanova • 2019

Related benchmarks

Task	Dataset	Metric	Result	
Open Question Answering	Natural Questions (NQ) (test)	Exact Match (EM)	33.3	134
Question Answering	NQ (test)	EM Accuracy	33.3	133
Retrieval	MS MARCO (dev)	MRR@10	0.261	84
Retrieval	TREC DL 2019	NDCG@10	61.5	83
Open-domain Question Answering	Natural Questions (NQ)	Exact Match (EM)	33.3	82
Open-domain Question Answering	TriviaQA (test)	Exact Match	47.1	80
Question Answering	Natural Questions (NQ) (dev)	--	--	72
Passage retrieval	TriviaQA (test)	Top-100 Acc	85.5	67
Retrieval	Natural Questions (test)	Top-5 Recall	69.8	62
End-to-end Open-Domain Question Answering	NQ (test)	Exact Match (EM)	33.3	59

Showing 10 of 35 rows
