SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval

About

We introduce SPARTA, a novel neural retrieval method that shows great promise in performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that use dense vector nearest neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an inverted index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense counterpart. We validated our approach on 4 open-domain question answering (OpenQA) tasks and 11 retrieval question answering (ReQA) tasks. SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including open SQuAD, Natural Questions, and CMRC. Analysis also confirms that the proposed method creates human-interpretable representations and allows flexible control over the trade-off between performance and efficiency.

Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee • 2020
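
To make the inverted-index idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that some model has already assigned each document a sparse set of term scores, and the scores, document IDs, and function names (`build_index`, `search`) are hypothetical. The `top_k` parameter is likewise an assumed way to illustrate trading accuracy for a smaller index, not a setting taken from the paper.

```python
from collections import defaultdict


def build_index(doc_term_scores, top_k=None):
    """Build an inverted index mapping term -> list of (doc_id, score).

    doc_term_scores: dict of doc_id -> {term: score}; the scores are assumed
    to be precomputed elsewhere (hypothetical values in this sketch).
    top_k: if given, keep only the top_k highest-scoring terms per document.
    """
    index = defaultdict(list)
    for doc_id, term_scores in doc_term_scores.items():
        # Drop non-positive scores so the representation stays sparse.
        items = sorted(
            ((term, score) for term, score in term_scores.items() if score > 0),
            key=lambda ts: ts[1],
            reverse=True,
        )
        if top_k is not None:
            items = items[:top_k]
        for term, score in items:
            index[term].append((doc_id, score))
    return index


def search(index, query_terms, limit=10):
    """Score documents by summing the stored scores of matching query terms."""
    totals = defaultdict(float)
    for term in query_terms:
        # Only postings for terms in the query are touched; no vector search.
        for doc_id, score in index.get(term, []):
            totals[doc_id] += score
    # Highest score first; ties broken by doc_id for determinism.
    return sorted(totals.items(), key=lambda ds: (-ds[1], ds[0]))[:limit]


# Hypothetical precomputed term scores for two toy documents.
doc_term_scores = {
    "d1": {"capital": 2.0, "france": 1.5, "paris": 1.0},
    "d2": {"capital": 0.5, "germany": 2.0, "berlin": 1.0},
}

full_index = build_index(doc_term_scores)
results = search(full_index, ["capital", "france"])

# A smaller index: each document keeps only its single strongest term.
small_index = build_index(doc_term_scores, top_k=1)
small_results = search(small_index, ["capital", "france"])
```

At query time the work is proportional to the length of the posting lists for the query terms, which is why a sparse representation of this kind can avoid approximate nearest neighbor search. Lowering `top_k` shrinks the index but can drop documents that match only on weaker terms.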

Related benchmarks

Task	Dataset	Result
Information Retrieval	BEIR	--	174
Open-domain Question Answering	NaturalQ-Open (test)	EM 37.5	37
Open-domain Question Answering	SQuAD Open-domain 1.1 (test)	Exact Match (EM) 59.3	30
Question Answering	SQuAD-Open	EM 59.3	28
Retrieval Question Answering	SQuAD	MRR 79	14
Retrieval Question Answering	News in-domain	MRR 46.6	10
Open-domain Question Answering	Natural Questions Open (dev)	EM 36.8	9
Retrieval Question Answering	Trivia	MRR 47.6	6
Retrieval Question Answering	NQ	MRR 75.8	6
Retrieval Question Answering	HotPot	MRR 47.7	6

Showing 10 of 22 rows
