Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
About
When trained effectively, the Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose Optimus, the first large-scale language VAE model. A universal latent embedding space for sentences is first pre-trained on a large text corpus, and then fine-tuned for various language generation and understanding tasks. Compared with GPT-2, Optimus enables guided language generation at an abstract level using the latent vectors. Compared with BERT, Optimus generalizes better on low-resource language understanding tasks thanks to the smooth structure of its latent space. Extensive experimental results on a wide range of language tasks demonstrate the effectiveness of Optimus; it achieves a new state of the art on VAE language modeling benchmarks. We hope that our pre-trained large VAE language model and its results help the NLP community renew its interest in deep generative models in the era of large-scale pre-training, and make these principled methods more practical.
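To make the architecture concrete, below is a minimal sketch (not the authors' released code) of an Optimus-style sentence VAE: a BERT encoder maps a sentence to a Gaussian latent code z, and a GPT-2 decoder is conditioned on z, here by adding a projection of z to its token embeddings. The class name, layer names, and this particular injection scheme are illustrative assumptions; the paper also describes injecting z into the decoder's attention layers.

```python
# A minimal, hypothetical sketch of an Optimus-style sentence VAE.
import torch
import torch.nn as nn
from transformers import BertModel, GPT2LMHeadModel

class SentenceVAE(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-cased")
        self.decoder = GPT2LMHeadModel.from_pretrained("gpt2")
        hid_enc = self.encoder.config.hidden_size
        hid_dec = self.decoder.config.n_embd
        self.to_mu = nn.Linear(hid_enc, latent_dim)      # posterior mean
        self.to_logvar = nn.Linear(hid_enc, latent_dim)  # posterior log-variance
        self.z_to_emb = nn.Linear(latent_dim, hid_dec)   # inject z into the decoder

    def forward(self, enc_ids, enc_mask, dec_ids, beta=1.0):
        # Encode: use the [CLS] representation as the sentence summary.
        h = self.encoder(input_ids=enc_ids,
                         attention_mask=enc_mask).last_hidden_state[:, 0]
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # Decode: add the projected latent code to every token embedding.
        emb = self.decoder.transformer.wte(dec_ids) + self.z_to_emb(z).unsqueeze(1)
        out = self.decoder(inputs_embeds=emb, labels=dec_ids)
        # Negative ELBO: reconstruction loss + beta * KL(q(z|x) || N(0, I)).
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
        return out.loss + beta * kl
```

The `beta` weight on the KL term is the standard knob for trading off reconstruction quality against the smoothness of the latent space that the low-resource generalization claim relies on.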
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Language Modeling | Penn Treebank (PTB) (test) | Perplexity 23.58 | 120 |
| Language Modeling | Yahoo (test) | -- | 48 |
| Language Modeling | Yelp (test) | Perplexity 21.99 | 35 |
| Mathematical Reasoning | Mathematics out-of-domain (test) | Accuracy 2 | 26 |
| Conclusion Generation | EntailmentBank (test) | BLEU 26 | 26 |
| Sentence Interpolation Smoothness | ARG0 (randomly sampled 200 sentence pairs) | Average IS 0.259 | 22 |
| Autoencoding | Mathematical expressions EVAL (test) | BLEU 96 | 22 |
| Language Modeling | Mathematical expression EVAL (test) | Exact Match 99 | 19 |
| Language Modeling | Explanatory sentences | BLEU 35 | 19 |
| Disentanglement | ARG0 | Accuracy 97.2 | 18 |
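The sentence interpolation rows measure how smoothly the latent space supports guided generation: encode two sentences, walk the line segment between their latent codes, and decode each point. A hedged sketch of that procedure, reusing the hypothetical `SentenceVAE` above (the helper names and greedy decoding are illustrative, not the released Optimus API):

```python
import torch

@torch.no_grad()
def greedy_decode(model, z, bos_id, eos_id, max_len=30):
    # Greedily decode token ids conditioned on a fixed latent z of shape
    # [1, latent_dim]; a tokenizer would turn the ids back into text.
    ids = torch.tensor([[bos_id]])
    for _ in range(max_len):
        emb = model.decoder.transformer.wte(ids) + model.z_to_emb(z).unsqueeze(1)
        next_id = model.decoder(inputs_embeds=emb).logits[:, -1].argmax(-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == eos_id:
            break
    return ids[0].tolist()

@torch.no_grad()
def interpolate(model, z_a, z_b, bos_id, eos_id, steps=10):
    # Decode along the segment from z_a to z_b; in a smooth latent space
    # the outputs transition gradually between the two endpoint sentences.
    return [greedy_decode(model, (1 - t) * z_a + t * z_b, bos_id, eos_id)
            for t in torch.linspace(0.0, 1.0, steps)]
```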