Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index

About

Existing open-domain question answering (QA) models are not suitable for real-time usage because they need to process several long documents on-demand for every input query. In this paper, we introduce the query-agnostic indexable representation of document phrases that can drastically speed up open-domain QA and also allows us to reach long-tail targets. In particular, our dense-sparse phrase encoding effectively captures syntactic, semantic, and lexical information of the phrases and eliminates the pipeline filtering of context documents. Leveraging optimization strategies, our model can be trained in a single 4-GPU server and serve entire Wikipedia (up to 60 billion phrases) under 2TB with CPUs only. Our experiments on SQuAD-Open show that our model is more accurate than DrQA (Chen et al., 2017) with 6000x reduced computational cost, which translates into at least 58x faster end-to-end inference benchmark on CPUs.

Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi • 2019
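
The abstract's core idea, a query-agnostic index in which every phrase is encoded once offline and each query reduces to an inner-product search, can be illustrated with a toy sketch. This is not the authors' implementation: the vectors, their dimensions, the three example phrases, and the helper names `build_index` and `search` are all hypothetical, and a brute-force NumPy matrix product stands in for whatever large-scale search structure the real system uses.

```python
import numpy as np

def build_index(dense_vecs, sparse_vecs):
    """Offline step: concatenate each phrase's dense and sparse parts.

    The result depends only on the documents, never on a query.
    """
    return np.hstack([dense_vecs, sparse_vecs])

def search(index, query_dense, query_sparse, top_k=1):
    """Online step: score every phrase with one inner product."""
    query = np.concatenate([query_dense, query_sparse])
    scores = index @ query
    order = np.argsort(-scores)[:top_k]
    return order.tolist(), scores[order].tolist()

# Hypothetical toy data: three candidate phrases.
phrases = ["Barack Obama", "1961", "Honolulu"]
dense = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])   # made-up dense encodings
sparse = np.array([[1.0, 0.0, 0.0],                       # made-up lexical features
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

index = build_index(dense, sparse)

query_dense = np.array([0.4, 0.6])
query_sparse = np.array([0.0, 0.0, 1.0])

# Dense + sparse score picks phrase 2.
ids, scores = search(index, query_dense, query_sparse)

# Dense-only score (sparse part zeroed out) picks phrase 1 instead.
dense_only_ids, _ = search(index, query_dense, np.zeros(3))

print(phrases[ids[0]], phrases[dense_only_ids[0]])
```

In this toy setup the sparse component changes the top-ranked phrase, which is one way to picture how lexical signals can complement a dense encoding. Because the index is built without seeing any query, no documents need to be re-read at question time.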

Related benchmarks

Task | Dataset | Metric | Result | Rank
Question Answering | SQuAD v1.1 (test) | F1 Score | 81.7 | 260
Open-domain Question Answering | SQUAD Open (test) | Exact Match | 36.2 | 39
Open-domain Question Answering | SQuAD Open-domain 1.1 (test) | Exact Match (EM) | 36.2 | 30
Reading Comprehension | SQuAD (dev) | F1 Score | 0.817 | 15
Open-domain Question Answering | Natural Questions (NQ) (test) | Accuracy | 8.1 | 14
Open-domain Question Answering | SQuAD (test) | Accuracy | 36.2 | 7
Reading Comprehension | Natural Questions (NQ) Long (dev) | EM | 68.2 | 4