Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
About
We introduce extreme summarization, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach. The idea is to create a short, one-sentence news summary answering the question "What is the article about?". We collect a real-world, large-scale dataset for this task by harvesting online articles from the British Broadcasting Corporation (BBC). We propose a novel abstractive model which is conditioned on the article's topics and based entirely on convolutional neural networks. We demonstrate experimentally that this architecture captures long-range dependencies in a document and recognizes pertinent content, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans.
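As a rough illustration of what "conditioned on the article's topics and based entirely on convolutional neural networks" could look like, here is a minimal sketch of one topic-conditioned convolutional layer. The abstract does not give the architecture, so everything here is an assumption: the function name `topic_conv_layer`, the choice to concatenate a document-level topic vector onto every token embedding, and the gated linear unit (GLU) activation borrowed from convolutional sequence-to-sequence models.

```python
import math

def glu(values, gates):
    # Gated linear unit: each value is scaled by the sigmoid of its gate.
    return [v * (1.0 / (1.0 + math.exp(-g))) for v, g in zip(values, gates)]

def topic_conv_layer(token_embs, doc_topics, weights, bias, kernel=3):
    """One 1-D convolution + GLU over topic-augmented token embeddings.

    token_embs: list of per-token embedding vectors (lists of floats)
    doc_topics: document topic distribution (assumed, e.g. from a topic model)
    weights:    2 * d_out rows, each of length kernel * (d_emb + d_topic)
    bias:       2 * d_out floats
    """
    # Assumption: conditioning is done by concatenating the topic vector
    # onto every token embedding before convolving.
    inputs = [list(emb) + list(doc_topics) for emb in token_embs]
    d_in = len(inputs[0])
    # Zero-pad so the output has one vector per input token.
    pad = [[0.0] * d_in for _ in range(kernel // 2)]
    padded = pad + inputs + pad
    outputs = []
    for i in range(len(inputs)):
        # Flatten the window of `kernel` neighbouring vectors.
        window = [x for vec in padded[i:i + kernel] for x in vec]
        pre = [sum(w * x for w, x in zip(row, window)) + b
               for row, b in zip(weights, bias)]
        half = len(pre) // 2
        outputs.append(glu(pre[:half], pre[half:]))
    return outputs
```

Stacking several such layers widens the receptive field, which is one way a purely convolutional model can reach long-range context in a document.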
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Summarization | XSum (test) | ROUGE-2 11.54 | 231 | |
| Email Subject Line Generation | AESLC (dev) | ROUGE-1 13.52 | 21 | |
| Email Subject Line Generation | AESLC (test) | ROUGE-1 12.6 | 21 | |
| Summarization | EMAILSUM short 1.0 (test) | R1 36.14 | 19 | |
| Summarization | EMAILSUM long 1.0 (test) | ROUGE-1 (R1) 43.48 | 19 | |
| Email Subject Generation | AESLC (test) | ESQE 1.54 | 11 | |
| Document-level Claim Extraction | AVeriTeC-DCE (test) | chrF 23.8 | 11 | |
| Summarization | OrangeSum Abstract | ROUGE-1 38.36 | 7 | |
| Summarization | OrangeSum Title | ROUGE-1 Score 31.62 | 7 | |
| Document-level Claim Extraction (Sentence) | AVeriTeC-DCE 1.0 (test) | SARI 6.41 | 6 | |
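
Most of the results above are ROUGE-N scores. For readers unfamiliar with the metric, the following is a minimal sketch of n-gram overlap scoring, assuming whitespace tokenization and lowercasing with no stemming. The function names `ngrams` and `rouge_n` are illustrative, and official scoring tools may differ in preprocessing. Scores here are in the 0 to 1 range; leaderboard figures are commonly reported on a 0 to 100 scale.

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all contiguous n-grams in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=2):
    """Return ROUGE-N precision, recall, and F1 for two strings."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
print(scores)
```

In this example the two sentences share three of their five bigrams, so precision and recall are both 3/5.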