Learning Recurrent Span Representations for Extractive Question Answering
About
The reading comprehension task, which asks questions about a given evidence document, is a central problem in natural language understanding. Recent formulations of this task have typically focused on answer selection from a set of candidates pre-defined manually or through the use of an external NLP pipeline. However, Rajpurkar et al. (2016) recently released the SQuAD dataset, in which the answers can be arbitrary strings from the supplied text. In this paper, we focus on this answer extraction task, presenting a novel model architecture that efficiently builds fixed-length representations of all spans in the evidence document with a recurrent network. We show that scoring explicit span representations significantly improves performance over other approaches that factor the prediction into separate predictions about words or start and end markers. Our approach improves upon the best published results of Wang & Jiang (2016) by 5% and decreases the error of Rajpurkar et al.'s baseline by > 50%.
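The core idea — enumerating all candidate spans of the passage, building a fixed-length representation for each, and scoring them jointly — can be illustrated with a minimal sketch. This is not the paper's exact architecture (which uses passage-aligned and passage-independent question representations feeding a recurrent encoder); here the encoder states are random placeholders, span representations are simply the concatenated endpoint states, and the scoring weights `w` are a hypothetical linear layer, all for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder "encoder" states: one H-dimensional hidden state per
# passage token, standing in for the output of a recurrent network.
T, H = 6, 4
hidden = rng.standard_normal((T, H))

# Enumerate every span (start <= end) and build a fixed-length
# representation for each by concatenating its endpoint states.
spans = [(i, j) for i in range(T) for j in range(i, T)]
span_reps = np.stack(
    [np.concatenate([hidden[i], hidden[j]]) for i, j in spans]
)  # shape: (T*(T+1)/2, 2H)

# Score every span with a (hypothetical) linear layer, then normalize
# with a single softmax over all candidate spans, so spans compete
# directly rather than via independent start/end predictions.
w = rng.standard_normal(2 * H)
scores = span_reps @ w
probs = np.exp(scores - scores.max())
probs /= probs.sum()

# Extraction picks the argmax span; for T=6 there are 21 candidates.
best_start, best_end = spans[int(np.argmax(probs))]
print(len(spans), (best_start, best_end))
```

The point of the single softmax is that the model assigns one probability distribution over explicit spans, which the abstract contrasts with factored word-level or start/end-marker predictions.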
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Question Answering | SQuAD v1.1 (test) | F1 Score | 78.7 | 260 |
| Question Answering | SQuAD (test) | F1 | 75.5 | 111 |
| Question Answering | SQuAD (dev) | F1 | 74.9 | 74 |
| Question Answering | SQuAD hidden 1.1 (test) | EM | 70.8 | 18 |
| Question Answering | adversarial SQuAD (test) | AddSent Score | 39.5 | 12 |
| Reading Comprehension | Adversarial SQuAD AddSent v1.1 (test) | F1 | 39.5 | 10 |
| Reading Comprehension | Adversarial SQuAD AddOneSent v1.1 (test) | F1 Score | 49.5 | 10 |
| Question Answering | SQuAD-Adversarial AddSent 1.1 (dev) | F1 Score | 39.5 | 9 |
| Question Answering | SQuAD-Adversarial AddOneSent 1.1 (dev) | F1 | 49.5 | 9 |