
Autoregressive Diffusion Models

About

We introduce Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models (Uria et al., 2014) and absorbing discrete diffusion (Austin et al., 2021), which we show are special cases of ARDMs under mild assumptions. ARDMs are simple to implement and easy to train. Unlike standard ARMs, they do not require causal masking of model representations, and can be trained using an efficient objective similar to modern probabilistic diffusion models that scales favourably to high-dimensional data. At test time, ARDMs support parallel generation which can be adapted to fit any given generation budget. We find that ARDMs require significantly fewer steps than discrete diffusion models to attain the same performance. Finally, we apply ARDMs to lossless compression, and show that they are uniquely suited to this task. Contrary to existing approaches based on bits-back coding, ARDMs obtain compelling results not only on complete datasets, but also on compressing single data points. Moreover, this can be done using a modest number of network calls for (de)compression due to the model's adaptable parallel generation.
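The training objective sketched in the abstract (sample a random generation order and a random step, mask the not-yet-generated variables, and score only the masked ones) can be written in a few lines. The sketch below is illustrative, not the authors' reference code: `logits_fn` is an assumed interface standing in for the network, which maps a masked integer array to per-position class logits, and `mask_token` is the absorbing state.

```python
import numpy as np

def ardm_loss(logits_fn, x, mask_token, rng):
    """One training-step loss for an order-agnostic ARDM (illustrative
    sketch). x: int array of shape (B, D); logits_fn maps a masked
    (B, D) array to (B, D, num_classes) logits -- an assumed interface."""
    B, D = x.shape
    # Sample a timestep t uniformly from {1, ..., D} per example.
    t = rng.integers(1, D + 1, size=(B, 1))
    # Sample a random generation order by ranking i.i.d. noise:
    # ranks[b, i] is the position of variable i in the order.
    ranks = rng.random((B, D)).argsort(-1).argsort(-1)
    # The first t-1 variables in the order are visible; the rest are
    # masked (absorbed), and the model predicts them all at once.
    masked = ranks >= (t - 1)
    x_in = np.where(masked, mask_token, x)
    logits = logits_fn(x_in)
    # Log-softmax over the class dimension.
    logp = logits - logits.max(-1, keepdims=True)
    logp = logp - np.log(np.exp(logp).sum(-1, keepdims=True))
    token_logp = np.take_along_axis(logp, x[..., None], axis=-1)[..., 0]
    # Sum log-probs over masked positions only, reweighted by
    # D / (D - t + 1) so the single sampled step is an unbiased
    # estimate of the full D-step NLL bound.
    n_masked = (D - t + 1).astype(float)[:, 0]
    nll = -(token_logp * masked).sum(-1) * D / n_masked
    return nll.mean()
```

Because the masked positions are scored jointly, no causal masking is needed inside the network, and one network call covers one uniformly sampled step rather than all D steps, which is what makes the objective cheap for high-dimensional data.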

Emiel Hoogeboom, Alexey A. Gritsenko, Jasmijn Bastings, Ben Poole, Rianne van den Berg, Tim Salimans • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Generation | CIFAR10 32x32 (test) | -- | 154 |
| Character-level Language Modeling | text8 (test) | 1.43 BPC | 128 |
| Language Modeling | text8 (test) | 1.43 BPC | 21 |
| Generative Modeling | CIFAR-10 8-bit color (test) | 2.64 bits per dimension | 15 |
