Fine-tune BERT for Extractive Summarization
About
BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at https://github.com/nlpyang/BertSum
Yang Liu • 2019
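The benchmark results below are reported in ROUGE-1, which measures unigram overlap between a system summary and a reference summary. As a rough illustration of the metric (a simplified sketch; the paper uses the official ROUGE toolkit, which additionally applies stemming and other preprocessing):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between candidate and reference.

    Tokenization here is plain whitespace splitting; the official ROUGE
    implementation differs (stemming, tokenization rules), so scores will
    not match published numbers exactly.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it
    # appears in the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
```

Reported scores (e.g. 43.25) are conventionally this F1 value multiplied by 100.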
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Summarization | CNN Daily Mail | ROUGE-1 | 43.25 | 67 |
| Extractive Summarization | CNN/Daily Mail (test) | ROUGE-1 | 43.25 | 36 |
| Extractive Summarization | NYT50 (test) | ROUGE-1 | 46.66 | 21 |
| Summarization | CNN/Daily Mail full length (test) | ROUGE-1 | 43.25 | 18 |
| Extractive Summarization | CNN-DM (test) | ROUGE-1 | 43.23 | 18 |
| Summarization | NYT50 limited length (test) | ROUGE-1 | 46.66 | 8 |
| Summarization | CNN/Daily Mail (test) | Relevance | 58 | 8 |
| Headline Generation | PANCO (test) | R1 (ROUGE-1) | 28.09 | 7 |
| Extractive Summarization | SCILIT (test) | ROUGE-1 | 42.53 | 2 |