
Latent Diffusion for Language Generation

About

Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then show that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation, and show across multiple diverse datasets that our latent language diffusion models are significantly more effective than previous diffusion language models.
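To make the two-stage recipe concrete, here is a minimal NumPy sketch of the diffusion half of the pipeline: a standard DDPM-style forward (noising) process and reverse (sampling) loop operating on a continuous latent code. The real system uses a pretrained encoder-decoder language model (e.g. an autoencoder built on one) to map text into and out of this latent space, and a learned Transformer denoiser; here the latent `target`, the noise schedule, and the `oracle` denoiser are toy stand-ins chosen only to exercise the loop, not the authors' actual components.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50                                   # number of diffusion steps (toy value)
betas = np.linspace(1e-4, 0.05, T)       # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def diffuse(z0, t):
    """Forward process: q(z_t | z_0) = N(sqrt(abar_t) z_0, (1 - abar_t) I)."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

def sample(denoise_fn, shape):
    """Reverse process: start from Gaussian noise, iteratively denoise.

    denoise_fn(z, t) predicts the noise component of z at step t; in the
    paper's setting this would be a Transformer trained in the latent space.
    """
    z = rng.standard_normal(shape)
    for t in reversed(range(T)):
        eps_hat = denoise_fn(z, t)
        # DDPM posterior-mean update
        z = (z - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # no noise is added at the final step
            z += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return z

# Toy "oracle" denoiser that knows the target latent, so the loop provably
# converges; it replaces the learned denoiser for illustration only.
target = np.ones((4, 8))                 # stand-in for an encoded latent code
def oracle(z, t):
    return (z - np.sqrt(alpha_bar[t]) * target) / np.sqrt(1.0 - alpha_bar[t])

z_sampled = sample(oracle, target.shape)
print(np.abs(z_sampled - target).mean())  # near zero: sampling recovers the latent
```

In the full pipeline, `z_sampled` would then be passed to the pretrained decoder to produce natural-language text; conditioning (class labels or a source sequence) enters through the denoiser.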

Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Q. Weinberger • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Summarization | XSum (test) | ROUGE-2 | 20 | 231
Machine Translation | WMT En-De '14 | BLEU | 22.4 | 89
Machine Translation | WMT De-En '14 | BLEU | 27 | 33
Seq2Seq | QQP | ROUGE-L | 66 | 18
Class-Conditional Language Generation | AG-News | MAUVE (World) | 0.842 | 16
Unconditional Language Generation | ROCStories (test) | MAUVE | 0.716 | 9
Unconditional Language Generation | AG News (test) | MAUVE | 0.866 | 5

Other info

Code
