Conversational Question Answering on Heterogeneous Sources
About
Conversational question answering (ConvQA) tackles sequential information needs where the context of follow-up questions is left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel problem of jointly tapping into all of these together, thereby boosting answer coverage and confidence. We present CONVINSE, an end-to-end pipeline for ConvQA over heterogeneous sources that operates in three stages: i) learning an explicit structured representation of an incoming question and its conversational context, ii) harnessing this frame-like representation to uniformly capture relevant evidence from KB, text, and tables, and iii) running a fusion-in-decoder model to generate the answer. We construct and release ConvMix, the first benchmark for ConvQA over heterogeneous sources, comprising 3000 real-user conversations with 16000 questions, along with entity annotations, completed question utterances, and question paraphrases. Experiments demonstrate the viability and advantages of our method compared to state-of-the-art baselines.
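The three stages described above can be illustrated with a minimal sketch. All function names, the frame fields, and the toy keyword-overlap retrieval and majority-vote "fusion" below are illustrative assumptions, standing in for the learned models in the actual CONVINSE pipeline:

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class SRFrame:
    """Hypothetical frame-like, intent-explicit representation (stage i)."""
    context_entity: str
    question_words: list

def question_understanding(question: str, history: list) -> SRFrame:
    # Stage i (toy): inherit the topical entity from the conversation
    # history when the follow-up question leaves it implicit.
    entity = history[-1]["entity"] if history else question.split()[0]
    return SRFrame(context_entity=entity, question_words=question.lower().split())

def retrieve_evidence(frame: SRFrame, sources: dict) -> list:
    # Stage ii (toy): score evidence snippets from KB, text, and tables
    # uniformly, here by simple word overlap with the frame.
    cues = set(frame.question_words) | {frame.context_entity.lower()}
    scored = []
    for source_name, snippets in sources.items():
        for snippet in snippets:
            score = len(set(snippet["text"].lower().split()) & cues)
            scored.append((score, source_name, snippet))
    scored.sort(key=lambda t: -t[0])
    return [snippet for _, _, snippet in scored[:3]]

def generate_answer(frame: SRFrame, evidences: list) -> str:
    # Stage iii (toy): stand-in for the fusion-in-decoder model --
    # fuse the top evidences by majority vote over candidate answers.
    counts = Counter(e["answer"] for e in evidences)
    return counts.most_common(1)[0][0]

# Toy heterogeneous sources for a follow-up question with an implicit entity.
sources = {
    "kb":    [{"text": "Dune director Denis Villeneuve", "answer": "Denis Villeneuve"}],
    "text":  [{"text": "Dune was directed by Denis Villeneuve.", "answer": "Denis Villeneuve"}],
    "table": [{"text": "Dune 2021 director Denis Villeneuve", "answer": "Denis Villeneuve"}],
}
history = [{"question": "When was Dune released?", "entity": "Dune"}]
frame = question_understanding("Who directed it?", history)
answer = generate_answer(frame, retrieve_evidence(frame, sources))
```

The point of the sketch is the data flow: the implicit "it" in the follow-up is resolved into an explicit frame once, and that single frame then drives evidence retrieval over all three source types and the final answer generation.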
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Conversational Question Answering | ConvMix 1.0 (test) | P@1 (All) | 34.2 | 21 |
| Temporal Question Answering | TimeQuestions 1.0 (test) | P@1 | 42.3 | 18 |
| Open-domain Question Answering | CompMix (test) | Exact Match | 40.7 | 9 |
| Question Answering | CompMix (test) | Precision@1 | 0.407 | 8 |
| Conversational Question Answering | ConvMix-5T 1.0 (test) | P@1 | 32.1 | 7 |
| Question Answering | CRAG (test) | P@1 | 29.8 | 6 |
| Conversational Question Answering | ConvMix 9 (test) | P@1 | 0.343 | 5 |
| Conversational Question Answering | ConvMix (test) | P@1 | 27.9 | 5 |