
UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization

About

The high annotation costs and diverse demands of various summarization tasks motivate the development of few-shot summarization. However, despite the emergence of many summarization tasks and datasets, the current training paradigm for few-shot summarization systems ignores potentially shareable knowledge in heterogeneous datasets. To this end, we propose UniSumm, a unified few-shot summarization model that is pre-trained on multiple summarization tasks and can be prefix-tuned to excel at any few-shot summarization task. Meanwhile, to better evaluate few-shot summarizers under the principles of diversity and robustness, we assemble and release a new benchmark, SummZoo. It consists of 8 summarization tasks, each with multiple sets of few-shot samples, covering diverse domains. Experimental results and analysis show that UniSumm outperforms strong baselines by a large margin across all sub-tasks in SummZoo under both automatic and human evaluations, and achieves results comparable to a GPT-3.5 model in human evaluation.
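The abstract describes adapting one pre-trained multi-task model to a new few-shot task via prefix-tuning, where small trainable key/value vectors are prepended to the frozen model's attention inputs. The following is a toy NumPy sketch of a single prefix-augmented attention layer, not the paper's implementation; all names and dimensions here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_prefix(q, k, v, prefix_k, prefix_v):
    """Single-head attention where trainable prefix key/value vectors are
    prepended to the (frozen) model's keys and values. In prefix-tuning,
    only prefix_k and prefix_v would receive gradient updates."""
    k = np.concatenate([prefix_k, k], axis=0)   # (p + n, d)
    v = np.concatenate([prefix_v, v], axis=0)   # (p + n, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (n, p + n)
    return softmax(scores) @ v                  # (n, d)

rng = np.random.default_rng(0)
d, n, p = 8, 5, 2                               # hidden size, input tokens, prefix length
q = rng.normal(size=(n, d))
k = rng.normal(size=(n, d))
v = rng.normal(size=(n, d))
prefix_k = rng.normal(size=(p, d))              # the only "tunable" parameters
prefix_v = rng.normal(size=(p, d))

out = attention_with_prefix(q, k, v, prefix_k, prefix_v)
print(out.shape)  # (5, 8): output shape is unchanged by the prefix
```

Because the prefix only enlarges the attended-over key/value set, the backbone's weights and output shapes stay untouched, which is what makes this adaptation cheap in the few-shot setting.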

Yulong Chen, Yang Liu, Ruochen Xu, Ziyi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Summarization | Xsum | ROUGE-2 | 11.36 | 108 |
| Summarization | arXiv | ROUGE-2 | 16.42 | 76 |
| Abstractive Summarization | SamSum | ROUGE-2 | 20.65 | 73 |
| Abstractive Summarization | Multi-News | ROUGE-2 | 15.86 | 47 |
| Summarization | SamSum | -- | -- | 30 |
| Abstractive Summarization | WikiHow | ROUGE-2 | 11.73 | 26 |
| Summarization | MultiNews (test) | -- | -- | 24 |
| Summarization | DIALOGSUM | ROUGE-2 | 15.64 | 17 |
| Summarization | WikiHow (test) | -- | -- | 12 |
| Summarization | SUMMZOO Average | ROUGE-2 | 13.97 | 11 |
Showing 10 of 17 rows
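The table reports ROUGE-2, which scores a system summary by its bigram overlap with a reference summary. A minimal illustrative implementation of ROUGE-2 F1 follows; note that official ROUGE toolkits add tokenization and stemming details this sketch omits, so its scores will not exactly match leaderboard numbers.

```python
from collections import Counter

def rouge2_f1(candidate: str, reference: str) -> float:
    """ROUGE-2 F1: harmonic mean of bigram precision and recall."""
    def bigrams(text):
        toks = text.lower().split()
        return Counter(zip(toks, toks[1:]))

    cand, ref = bigrams(candidate), bigrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())        # clipped bigram matches
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = rouge2_f1("the cat sat on the mat", "the cat is on the mat")
print(round(score, 2))  # 0.6: 3 of 5 bigrams match in each direction
```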

Other info

Code
