
A Variational Hierarchical Model for Neural Cross-Lingual Summarization

About

The goal of cross-lingual summarization (CLS) is to convert a document in one language (e.g., English) into a summary in another language (e.g., Chinese). Essentially, CLS combines machine translation (MT) and monolingual summarization (MS), so a hierarchical relationship exists between MT & MS and CLS. Existing studies on CLS mainly focus on pipeline methods or on jointly training an end-to-end model with an auxiliary MT or MS objective. However, it is very challenging for a model to conduct CLS directly, as the task requires both the ability to translate and the ability to summarize. To address this issue, we propose a hierarchical model for the CLS task based on the conditional variational auto-encoder. The model contains latent variables at two levels, local and global. At the local level, there are two latent variables, one for translation and one for summarization. At the global level, a third latent variable models cross-lingual summarization, conditioned on the two local-level variables. Experiments on two language directions (English-Chinese) verify the effectiveness and superiority of the proposed approach. In addition, we show that our model generates better cross-lingual summaries than comparison models in the few-shot setting.
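The abstract describes two local latent variables (one for MT, one for MS) and a global latent variable conditioned on both. The following is a minimal NumPy sketch of just that conditioning structure, using the standard Gaussian reparameterization trick; all dimensions, projections, and variable names here are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar, rng):
    # z = mu + sigma * eps  (reparameterization trick for a Gaussian latent)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def gaussian_latent(h, d_z, rng):
    # Hypothetical projection of a hidden state to Gaussian parameters,
    # then a sample via reparameterization. Weights are random placeholders.
    d_in = h.shape[0]
    mu = h @ rng.standard_normal((d_in, d_z))
    logvar = h @ rng.standard_normal((d_in, d_z))
    return reparameterize(mu, logvar, rng), mu, logvar

d_model, d_z = 8, 4
# Stand-in for a mean-pooled encoder representation of the source document.
h = rng.standard_normal(d_model)

# Local level: one latent variable for translation, one for summarization.
z_mt, _, _ = gaussian_latent(h, d_z, rng)
z_ms, _, _ = gaussian_latent(h, d_z, rng)

# Global level: the CLS latent is conditioned on both local-level variables
# by concatenating them with the document representation.
h_global = np.concatenate([h, z_mt, z_ms])
z_cls, mu_cls, logvar_cls = gaussian_latent(h_global, d_z, rng)

print(z_cls.shape)  # (4,)
```

In a full CVAE, each latent would additionally contribute a KL term to the training objective and feed the decoder; this sketch only shows how the global variable depends on the two local ones.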

Yunlong Liang, Fandong Meng, Chulun Zhou, Jinan Xu, Yufeng Chen, Jinsong Su, Jie Zhou • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Cross-lingual Summarization | En2ZhSum (test) | ROUGE-1 | 41.95 | 31 |
| Cross-lingual Summarization | Zh2EnSum (test) | ROUGE-1 | 43.97 | 27 |
| Cross-lingual Summarization | Zh2EnSum 0.1% few-shot (test) | Informativeness (IF) | 2.68 | 5 |
| Cross-lingual Summarization | En2ZhSum 0.1% few-shot (test) | Informative Score | 2.56 | 5 |

Other info

Code
