Neural Summarization by Extracting Sentences and Words

About

Traditional approaches to extractive summarization rely heavily on human-engineered features. In this work we propose a data-driven approach based on neural networks and continuous sentence features. We develop a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor. This architecture allows us to develop different classes of summarization models which can extract sentences or words. We train our models on large scale corpora containing hundreds of thousands of document-summary pairs. Experimental results on two summarization datasets demonstrate that our models obtain results comparable to the state of the art without any access to linguistic annotation.

Jianpeng Cheng, Mirella Lapata • 2016

Related benchmarks

Task	Dataset	Result
Summarization	arXiv (test)	ROUGE-1 42.24	161
Summarization	CNN/Daily Mail original, non-anonymized (test)	ROUGE-1 41.13	54
Abstractive Summarization	CNN/DailyMail full length F-1 (test)	ROUGE-1 40.11	48
Extractive Summarization	CNN/Daily Mail (test)	ROUGE-1 42.2	36
Summarization	CNNDM full-length F1 (test)	ROUGE-1 40.11	19
Summarization	DUC 2002 (test)	ROUGE-1 47.4	18
Summarization	PubMed 2018 (test)	ROUGE-1 43.89	15
Multimodal Summarization	Daily Mail	ROUGE-1 41.22	10
Summarization	CNN+DailyMail mixed (test)	ROUGE-1 35.5	9
Extractive Summarization	DailyMail 75 bytes (test)	ROUGE-1 22.7	7
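
The results listed here are ROUGE-1 scores, a metric based on unigram overlap between a system summary and a reference summary. The Python sketch below is a simplified illustration of that idea, not the official ROUGE toolkit. The function name `rouge_1`, the lowercase whitespace tokenization, and the choice to return precision, recall, and F1 together are all assumptions made for this example. The listing does not say which variant or preprocessing each benchmark row uses, and the scores appear to be on a 0-100 scale, so the fractions returned here would need to be multiplied by 100 to compare.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram-overlap precision, recall, and F1."""
    # Assumption: lowercase + whitespace tokenization. Official tooling
    # offers stemming and other options that are not reproduced here.
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: a unigram is matched at most as many times as it
    # occurs in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
    print({name: round(value, 4) for name, value in scores.items()})
```

In the usage example, five of the six unigrams on each side match, so precision, recall, and F1 all come out to 5/6.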
