
Unsupervised Summarization Re-ranking

About

With the rise of task-specific pre-training objectives, abstractive summarization models like PEGASUS offer appealing zero-shot performance on downstream summarization tasks. However, the performance of such unsupervised models still lags significantly behind their supervised counterparts. As in the supervised setup, we observe very high variance in quality among the summary candidates these models produce, while only one candidate is kept as the summary output. In this paper, we propose to re-rank summary candidates in an unsupervised manner, aiming to close the performance gap between unsupervised and supervised models. Our approach improves the unsupervised PEGASUS by up to 7.27% and ChatGPT by up to 6.86% relative mean ROUGE across four widely-adopted summarization benchmarks, and achieves relative gains of 7.51% (up to 23.73% from XSum to WikiHow) averaged over 30 zero-shot transfer setups (fine-tuning on one dataset, evaluating on another).
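The core idea is to generate several summary candidates (e.g., via beam search from PEGASUS) and then select the best one without access to reference summaries. The sketch below illustrates the re-ranking step only, using unigram overlap with the source document as a stand-in unsupervised scoring feature; the function names and the scoring metric are illustrative assumptions, not the paper's actual feature set.

```python
def tokenize(text):
    # lowercase and strip trailing punctuation; a deliberately simple tokenizer
    return [t.lower().strip(".,;:!?") for t in text.split()]

def overlap_score(candidate, source):
    # unsupervised proxy score: fraction of candidate unigrams found in the source
    src_vocab = set(tokenize(source))
    cand_tokens = tokenize(candidate)
    if not cand_tokens:
        return 0.0
    return sum(t in src_vocab for t in cand_tokens) / len(cand_tokens)

def rerank(candidates, source):
    # keep the candidate that scores highest under the unsupervised metric,
    # instead of always keeping the top beam
    return max(candidates, key=lambda c: overlap_score(c, source))

source = "The quick brown fox jumps over the lazy dog near the river."
candidates = [
    "A cat sleeps indoors.",
    "The brown fox jumps over the lazy dog.",
]
print(rerank(candidates, source))  # → "The brown fox jumps over the lazy dog."
```

In practice the candidate set would come from a generation model and the scorer would combine several features, but the selection step has this same shape: score each candidate against the source, keep the argmax.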

Mathieu Ravaut, Shafiq Joty, Nancy Chen • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Abstractive Summarization | XSum (test) | ROUGE-L | 14.93 | 44 |
| Abstractive Summarization | WikiHow | ROUGE-2 | 7.26 | 26 |
| Abstractive Summarization | XSum | ROUGE-1 | 27.98 | 18 |
| Abstractive Summarization | CNN/DM | ROUGE-1 | 42.05 | 14 |
| Unsupervised abstractive summarization | CNN-DM (test) | ROUGE-1 | 39.76 | 12 |
| Summarization | CNN/DM human evaluation | Informational Content Score | 24 | 4 |
| Unsupervised abstractive summarization | WikiHow (test) | ROUGE-1 | 0.265 | 4 |
| Unsupervised abstractive summarization | SamSum (test) | ROUGE-1 | 28.91 | 4 |
| Abstractive Summarization | SamSum (test) | factCC | 96.28 | 2 |

Other info

Code
