Differentiable Reasoning over a Virtual Knowledge Base
About
We consider the task of answering complex multi-hop questions using a corpus as a virtual knowledge base (KB). In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus. At each step the module uses a combination of sparse-matrix TFIDF indices and maximum inner product search (MIPS) on a special index of contextual representations of the mentions. This module is differentiable, so the full system can be trained end-to-end using gradient-based methods, starting from natural language inputs. We also describe a pretraining scheme for the contextual representation encoder that generates hard negative examples using existing knowledge bases. We show that DrKIT improves accuracy by 9 points on 3-hop questions in the MetaQA dataset, cutting the gap between text-based and KB-based state-of-the-art by 70%. On HotpotQA, DrKIT leads to a 10% improvement over a BERT-based re-ranking approach to retrieving the relevant passages required to answer a question. DrKIT is also very efficient, processing 10-100x more queries per second than existing multi-hop systems.
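The core idea of one such "hop" can be sketched with toy matrices: spread probability mass from the current entities to the mentions they co-occur with (the sparse TFIDF expansion), score those mentions against the query by inner product (the MIPS step), keep the top-k, and aggregate the surviving mention weights back into a new entity distribution. This is an illustrative sketch only: the matrix names (`ent2mention`, `mention2ent`), the stand-in encodings, and the hard top-k (DrKIT uses a differentiable relaxation) are all assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def drkit_hop(entity_probs, ent2mention, mention_vecs, query_vec, mention2ent, k=2):
    """One soft 'hop' from an entity distribution to a new entity distribution.
    Hard top-k is used here for simplicity; the real module keeps this step
    differentiable so the whole pipeline can be trained end-to-end."""
    mention_mass = entity_probs @ ent2mention          # reachability via sparse co-occurrence
    scores = mention_vecs @ query_vec                  # inner-product relevance (MIPS step)
    combined = mention_mass * np.maximum(scores, 0.0)  # reachable AND relevant
    topk = np.argsort(combined)[-k:]                   # keep the k best mentions
    weights = np.zeros_like(combined)
    weights[topk] = combined[topk]
    if weights.sum() > 0:
        weights /= weights.sum()                       # renormalise to a distribution
    return weights @ mention2ent                       # map mentions back to entities

# Toy index: 3 entities, 4 mentions.
ent2mention = np.array([[1., 1., 0., 0.],   # entity 0 appears in mentions 0 and 1
                        [0., 0., 1., 0.],
                        [0., 0., 0., 1.]])
mention2ent = np.array([[1., 0., 0.],       # row i: which entity mention i refers to
                        [0., 1., 0.],
                        [0., 1., 0.],
                        [0., 0., 1.]])
mention_vecs = np.eye(4)                    # stand-in contextual mention encodings
query_vec = np.array([0., 1., 1., 0.])      # stand-in query encoding

start = np.array([1., 0., 0.])              # all mass on entity 0
print(drkit_hop(start, ent2mention, mention_vecs, query_vec, mention2ent))
# → [0. 1. 0.]: the hop moves the mass to entity 1
```

In a real system `ent2mention` is a large sparse matrix and the MIPS step runs against a compressed index, which is what makes multi-hop traversal fast enough to process far more queries per second than pipeline approaches.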
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-hop Question Answering | HotpotQA fullwiki setting (test) | Answer F1: 51.7 | 64 |
| Answer extraction and supporting sentence prediction | HotpotQA fullwiki (test) | Answer EM: 42.13 | 48 |
| Multi-hop Question Answering | HotpotQA (dev) | Answer F1: 51.7 | 43 |
| Multi-hop Question Answering | HotpotQA fullwiki setting (dev) | Answer F1: 46.6 | 38 |
| Question Answering | HotpotQA (test) | Answer F1: 51.7 | 37 |
| Retrieval | HotpotQA | -- | 24 |
| Retrieval | HotpotQA full wiki (dev) | P EM: 38.3 | 19 |
| Document Retrieval | HotpotQA (dev) | Recall@2: 38.3 | 13 |