Embracing data abundance: BookTest Dataset for Reading Comprehension

About

There is a practically unlimited amount of natural language data available, yet recent work in text comprehension has focused on datasets that are small relative to current computing possibilities. This article makes a case for the community to move to larger data and, as a step in that direction, proposes the BookTest, a new dataset similar to the popular Children's Book Test (CBT) but more than 60 times larger. We show that training on the new data improves the accuracy of our Attention-Sum Reader model on the original CBT test data by a much larger margin than many recent attempts to improve the model architecture. On one version of the dataset our ensemble even exceeds the human baseline provided by Facebook. We then show in our own human study that there is still space for further improvement.
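The core of the Attention-Sum Reader is a pointer-style answer selection: the query attends over every document token, and the probability of a candidate answer is the sum of the attention mass it receives across all of its occurrences in the document. The following is a minimal NumPy sketch of that final step only; the function and variable names are illustrative, and the contextual token encodings are assumed to be precomputed (in the paper they come from bidirectional GRU encoders), not produced by this snippet.

```python
import numpy as np

def attention_sum_answer(doc_token_ids, doc_encodings, query_encoding, candidate_ids):
    """Sketch of the attention-sum answer selection step.

    doc_token_ids  : (T,)   int id of each document token
    doc_encodings  : (T, d) contextual embedding of each document token
                            (assumed precomputed, e.g. by a bidirectional GRU)
    query_encoding : (d,)   single vector encoding the cloze query
    candidate_ids  : list of int ids of the candidate answers
    """
    # Dot-product attention of the query over every document position,
    # normalised with a softmax.
    scores = doc_encodings @ query_encoding             # (T,)
    scores -= scores.max()                              # numerical stability
    attention = np.exp(scores) / np.exp(scores).sum()   # (T,)

    # "Attention sum": a candidate's probability is the total attention
    # over all positions where that candidate occurs in the document.
    candidate_probs = {
        c: float(attention[doc_token_ids == c].sum()) for c in candidate_ids
    }
    best = max(candidate_probs, key=candidate_probs.get)
    return best, candidate_probs
```

Because the answer is always a word that appears in the document (as in CBT-style cloze questions), this pointer-sum formulation needs no output vocabulary and benefits directly from more training text, which is the motivation for the larger BookTest corpus.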

Ondrej Bajgar, Rudolf Kadlec, Jan Kleindienst • 2016

Related benchmarks

| Task | Dataset | Accuracy (%) | Rank |
| --- | --- | --- | --- |
| Machine Comprehension | CNN (val) | 73.9 | 80 |
| Machine Comprehension | CNN (test) | 75.4 | 77 |
| Machine Comprehension | CBT-CN (test) | 83.7 | 56 |
| Machine Comprehension | CBT-NE (test) | 78.4 | 56 |
| Machine Reading Comprehension | Daily Mail (test) | 77.7 | 46 |
| Machine Comprehension | CBT-NE (val) | 82.3 | 37 |
| Machine Comprehension | CBT-CN (val) | 85.7 | 37 |
| Machine Reading Comprehension | Daily Mail (val) | 78.7 | 36 |
| Cloze-style Question Answering | WDW Strict 1.0 (test) | 57.0 | 10 |
| Cloze-style Question Answering | WDW Relaxed 1.0 (test) | 59.0 | 9 |
