
Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions

About

Generative flows and diffusion models have been predominantly trained on ordinal data, for example natural images. This paper introduces two extensions of flows and diffusion for categorical data such as language or image segmentation: Argmax Flows and Multinomial Diffusion. Argmax Flows are defined by a composition of a continuous distribution (such as a normalizing flow), and an argmax function. To optimize this model, we learn a probabilistic inverse for the argmax that lifts the categorical data to a continuous space. Multinomial Diffusion gradually adds categorical noise in a diffusion process, for which the generative denoising process is learned. We demonstrate that our method outperforms existing dequantization approaches on text modelling and modelling on image segmentation maps in log-likelihood.
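The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`lift_to_continuous`, `categorical_noise_step`) are hypothetical, the standard-normal draw stands in for the learned probabilistic inverse, and the softplus thresholding and uniform-resampling noise step are assumed as plausible concrete choices rather than taken from the text above.

```python
import math
import random


def softplus(z):
    # Numerically stable log(1 + exp(z)); always strictly positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def argmax(v):
    # Index of the largest entry of v.
    return max(range(len(v)), key=lambda i: v[i])


def lift_to_continuous(x, num_classes, rng):
    """Sample a continuous vector v with argmax(v) == x.

    Assumption: u is drawn from a standard normal here; in a trained
    model this would be a learned density conditioned on x.
    """
    u = [rng.gauss(0.0, 1.0) for _ in range(num_classes)]
    threshold = u[x]
    # Push every coordinate below the threshold, then restore entry x.
    v = [threshold - softplus(threshold - u_i) for u_i in u]
    v[x] = threshold
    return v


def categorical_noise_step(x, num_classes, beta, rng):
    """One forward noising step on a single categorical value.

    With probability beta the class is resampled uniformly,
    otherwise it is kept unchanged.
    """
    if rng.random() < beta:
        return rng.randrange(num_classes)
    return x


rng = random.Random(0)
v = lift_to_continuous(2, num_classes=5, rng=rng)
print(argmax(v))  # recovers the original class index, 2
```

Applying `categorical_noise_step` repeatedly with a schedule of `beta` values drives any starting class toward the uniform distribution; the generative model would be trained to reverse those steps.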

Emiel Hoogeboom, Didrik Nielsen, Priyank Jaini, Patrick Forré, Max Welling • 2021

Related benchmarks

Task | Dataset | Result | Rank
Character-level Language Modeling | enwik8 (test) | -- | 195
Character-level Language Modeling | text8 (test) | BPC 1.39 | 128
Machine Translation | IWSLT | BLEU 21.28 | 31
Language Modeling | text8 | BPC 1.39 | 23
Paraphrasing | QQP | BLEU 20.7 | 22
Language Modeling | text8 (test) | BPC 1.23 | 21
Question Generation | QT | BLEU 16.96 | 14
Neural Machine Translation | WMT16 | BLEU 25.25 | 14
Machine Translation | WMT14 | BLEU 6.94 | 8
Segmentation Map Generation | Cityscapes (test) | ELBO 0.365 | 7
Showing 10 of 13 rows
