Sentence Centrality Revisited for Unsupervised Summarization
About
Single-document summarization has enjoyed renewed interest in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. In this paper we develop an unsupervised approach, arguing that it is unrealistic to expect large-scale, high-quality training data to be available or created for every type of summary, domain, or language. We revisit a popular graph-based ranking algorithm and modify how node (i.e., sentence) centrality is computed in two ways: (a) we employ BERT, a state-of-the-art neural representation learning model, to better capture sentential meaning, and (b) we build graphs with directed edges, arguing that the contribution of any two nodes to their respective centrality is influenced by their relative position in the document. Experimental results on three news summarization datasets representative of different languages and writing styles show that our approach outperforms strong baselines by a wide margin.
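The directed-edge idea in (b) can be illustrated with a minimal sketch: given a pairwise sentence-similarity matrix (in the paper this would come from BERT representations; here any precomputed matrix works), each sentence's centrality sums its edges to later sentences and to earlier sentences under different weights. The function name and the specific weight values below are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np

def directed_centrality(sim, w_fwd=1.0, w_bwd=0.3):
    """Position-aware centrality for extractive summarization.

    sim    : (n, n) matrix of pairwise sentence similarities
             (e.g., cosine similarity of BERT sentence embeddings).
    w_fwd  : weight on edges pointing to sentences that appear later.
    w_bwd  : weight on edges pointing to sentences that appear earlier.
             (Illustrative values; the paper tunes such weights on a
             validation set.)
    Returns an array of centrality scores, one per sentence.
    """
    n = sim.shape[0]
    scores = np.zeros(n)
    for i in range(n):
        fwd = sim[i, i + 1:].sum()  # edges to following sentences
        bwd = sim[i, :i].sum()      # edges to preceding sentences
        scores[i] = w_fwd * fwd + w_bwd * bwd
    return scores

# Toy example: sentence 0 is similar to everything after it, so with
# forward edges weighted more heavily it ranks highest.
sim = np.array([[1.0, 0.9, 0.8],
                [0.9, 1.0, 0.1],
                [0.8, 0.1, 1.0]])
scores = directed_centrality(sim)
summary_order = np.argsort(-scores)  # pick top-k indices as the summary
```

With symmetric weights (`w_fwd == w_bwd`) this reduces to the classic undirected centrality of TextRank-style methods; the asymmetry is what encodes the positional bias described above.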
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Summarization | arXiv (test) | ROUGE-1 | 38.57 | 161 |
| Extreme Summarization | SciTLDR 1.0 (test) | ROUGE-1 | 28.7 | 20 |
| Summarization | PubMed 2018 (test) | ROUGE-1 | 39.79 | 15 |
| Scientific Extreme Summarization | SciTLDR (test) | ROUGE-1 | 19.3 | 14 |
| Extractive Summarization | COVIDET-EXT 1.0 (test) | ROUGE-2 (Anger) | 30.8 | 9 |
| Extractive Summarization | COVIDET-EXT | Anger | 29.7 | 9 |
| Extractive Summarization | PubMed | Content Coverage | 30.52 | 2 |