Latent Normalizing Flows for Discrete Sequences
About
Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.
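The core mechanism the model relies on is a normalizing flow: an invertible transformation of a latent vector whose log-determinant Jacobian gives an exact density via the change of variables. As a minimal sketch of that building block (not the paper's architecture), here is a single affine coupling layer in NumPy, with fixed random linear maps standing in for the learned scale and shift networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the learned conditioner networks of a real coupling layer
# (hypothetical parameters, for illustration only).
D = 4  # latent dimensionality, an assumption for this sketch
W_s = rng.normal(size=(D // 2, D // 2)) * 0.1  # produces the log-scale
W_t = rng.normal(size=(D // 2, D // 2)) * 0.1  # produces the shift

def coupling_forward(z):
    """Affine coupling: transform the second half of z conditioned on the first.

    Returns the transformed vector and log|det J|, the change-of-variables
    correction needed to compute an exact log-likelihood.
    """
    z1, z2 = z[: D // 2], z[D // 2 :]
    s = np.tanh(z1 @ W_s)  # bounded log-scale for numerical stability
    t = z1 @ W_t           # shift
    y2 = z2 * np.exp(s) + t
    return np.concatenate([z1, y2]), s.sum()

def coupling_inverse(y):
    """Exact inverse: the first half passes through unchanged, so the same
    conditioner outputs can be recomputed and the transform undone."""
    y1, y2 = y[: D // 2], y[D // 2 :]
    s = np.tanh(y1 @ W_s)
    t = y1 @ W_t
    z2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, z2])

z = rng.normal(size=D)
y, log_det = coupling_forward(z)
z_back = coupling_inverse(y)  # recovers z exactly, confirming invertibility
```

Stacking many such layers (with the halves swapped between layers) yields the flexible, multimodal latent densities the abstract calls for, while the triangular Jacobian keeps the log-determinant cheap to evaluate.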
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Character-level Language Modeling | enwik8 (test) | BPC | -- | 195 |
| Character-level Language Modeling | text8 (test) | BPC | 1.62 | 128 |
| Character-level Language Modeling | Penn Treebank (test) | BPC | 1.46 | 113 |
| Character-level Language Modeling | Penn Treebank char-level (test) | BPC | 1.46 | 25 |
| Language Modeling | text8 (test) | BPC | 1.88 | 21 |
| Polyphonic music modeling | Nottingham (Nott) | NLL (nats) | 2.39 | 14 |
| Polyphonic music modeling | JSB Chorales | NLL (nats) | 6.53 | 14 |
| Polyphonic music modeling | Piano-midi.de | NLL (nats) | 7.77 | 12 |
| Polyphonic music modeling | MuseData (Muse) | NLL (nats) | 6.92 | 12 |