A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
About
We propose a unified model combining the strengths of extractive and abstractive summarization. On one hand, a simple extractive model can obtain sentence-level attention with high ROUGE scores, but its output is less readable. On the other hand, a more complicated abstractive model can obtain word-level dynamic attention to generate a more readable paragraph. In our model, sentence-level attention is used to modulate the word-level attention such that words in less-attended sentences are less likely to be generated. Moreover, a novel inconsistency loss function is introduced to penalize inconsistency between the two levels of attention. By training our model end-to-end with the inconsistency loss together with the original losses of the extractive and abstractive models, we achieve state-of-the-art ROUGE scores and produce the most informative and readable summaries on the CNN/Daily Mail dataset in a solid human evaluation.
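The two core ideas above, modulating word-level attention by sentence-level attention and penalizing disagreement between the two, can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names, the `sent_ids` mapping from words to sentences, and the top-`k` loss form are assumptions for the sketch.

```python
import numpy as np

def modulate_attention(word_attn, sent_attn, sent_ids):
    """Scale each word's attention by the attention of its containing
    sentence, then renormalize so the result is a distribution.

    word_attn: (num_words,) word-level attention at one decoder step
    sent_attn: (num_sents,) sentence-level attention
    sent_ids:  (num_words,) index of the sentence containing each word
    """
    scaled = word_attn * sent_attn[sent_ids]
    return scaled / scaled.sum()

def inconsistency_loss(word_attn, sent_attn, sent_ids, k=3):
    """Penalize word attention landing in weakly attended sentences:
    -log of the mean (word_attn * sent_attn) over the top-k attended
    words. The loss is small when the most attended words sit in the
    most attended sentences."""
    topk = np.argsort(word_attn)[-k:]
    mean_agreement = np.mean(word_attn[topk] * sent_attn[sent_ids[topk]])
    return -np.log(mean_agreement + 1e-12)

# Toy example: 4 words spread over 2 sentences.
word_attn = np.array([0.1, 0.2, 0.3, 0.4])
sent_attn = np.array([0.6, 0.4])
sent_ids = np.array([0, 0, 1, 1])

modulated = modulate_attention(word_attn, sent_attn, sent_ids)
loss = inconsistency_loss(word_attn, sent_attn, sent_ids, k=2)
```

Words in the weakly attended sentence are down-weighted after modulation, and the loss grows when high word attention falls in low-attention sentences, which is the behavior the inconsistency loss is meant to encourage during end-to-end training.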
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Abstractive Text Summarization | CNN/Daily Mail (test) | ROUGE-L | 37.13 | 169 |
| Text Summarization | CNN/Daily Mail (test) | ROUGE-2 | 17.97 | 65 |
| Summarization | CNN/Daily Mail original, non-anonymized (test) | ROUGE-1 | 40.68 | 54 |
| Abstractive Summarization | CNN/Daily Mail non-anonymized (test) | ROUGE-1 | 40.68 | 52 |
| Email Subject Line Generation | AESLC (dev) | ROUGE-1 | 22.98 | 21 |
| Email Subject Line Generation | AESLC (test) | ROUGE-1 | 22.8 | 21 |
| Summarization | CNNDM full-length F1 (test) | ROUGE-1 | 40.88 | 19 |
| Summarization | CNN/Daily Mail full length (test) | ROUGE-1 | 40.68 | 18 |
| Email Subject Generation | AESLC (test) | ESQE | 1.46 | 11 |