
TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings

About

This paper presents the Text Encoding Diffusion Model (TEncDM), a novel approach to diffusion modeling that operates in the space of pre-trained language model encodings. In contrast to traditionally used embeddings, encodings integrate contextual information. In our approach, we also employ a transformer-based decoder, specifically designed to incorporate context in the token prediction process. We conduct a comprehensive examination of the influence of the encoder, decoder, noise scheduler, and self-conditioning on zero-shot generation. Furthermore, we compare TEncDM with previous approaches on three conditional text generation tasks: QQP, XSum, and Wiki-Auto. The results show that TEncDM exhibits superior performance compared to existing non-autoregressive diffusion models. Our code is available at https://github.com/M0RJIQUE/tencdm.
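To make the core idea concrete, below is a minimal PyTorch sketch of the training objective implied by the abstract: diffuse in the space of contextual encodings and train a transformer to predict the clean encoding. Everything here is an illustrative assumption, not the authors' implementation: random tensors stand in for the frozen pretrained encoder's outputs, the cosine-style schedule and module sizes are placeholders, and the names (EncodingDenoiser, add_noise) are hypothetical. Self-conditioning and the context-aware decoder are omitted for brevity.

```python
# Minimal sketch of diffusion in LM-encoding space; NOT the authors' code.
# Assumptions: a frozen BERT-style encoder would supply z_0; the denoiser
# predicts the clean encoding; the schedule below is illustrative only.
import torch
import torch.nn as nn

class EncodingDenoiser(nn.Module):
    """Transformer that predicts the clean encoding z_0 from a noisy z_t."""
    def __init__(self, dim=768, num_layers=6, num_heads=12):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers)
        self.time_embed = nn.Sequential(
            nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim)
        )

    def forward(self, z_t, t):
        # Broadcast a timestep embedding over every token position.
        return self.backbone(z_t + self.time_embed(t[:, None, None].float()))

def add_noise(z_0, t, num_steps=1000):
    """Forward diffusion: z_t = sqrt(a_t) * z_0 + sqrt(1 - a_t) * eps."""
    alpha_bar = torch.cos((t / num_steps) * torch.pi / 2) ** 2  # assumed schedule
    a = alpha_bar[:, None, None]
    return a.sqrt() * z_0 + (1 - a).sqrt() * torch.randn_like(z_0)

# One training step on hypothetical shapes; z_0 stands in for frozen-encoder
# outputs of shape (batch, seq_len, hidden_dim).
batch, seq_len, dim, num_steps = 8, 64, 768, 1000
z_0 = torch.randn(batch, seq_len, dim)
t = torch.randint(0, num_steps, (batch,))
model = EncodingDenoiser(dim)
loss = nn.functional.mse_loss(model(add_noise(z_0, t), t), z_0)
loss.backward()
```

Predicting z_0 directly (rather than the noise) is a common choice for text diffusion, since the decoder ultimately needs a clean encoding to map back to tokens.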

Alexander Shabalin, Viacheslav Meshchaninov, Egor Chimbulatov, Vladislav Lapikov, Roman Kim, Grigory Bartosh, Dmitry Molchanov, Sergey Markov, Dmitry Vetrov • 2024

Related benchmarks

Task                 | Dataset  | Result    | Rank
Paraphrasing         | QQP      | BLEU 33   | 22
Text Simplification  | WikiAuto | BLEU 41.6 | 14
Question Generation  | QT       | BLEU 11.1 | 14
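For reference, results of this kind are typically reported as corpus-level BLEU. A minimal scoring sketch with the sacrebleu library follows; the strings are hypothetical placeholders, not data from the paper:

```python
import sacrebleu  # pip install sacrebleu

# Hypothetical model outputs and references, for illustration only.
hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")
```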
