Generative Modeling of Discrete Data Using Geometric Latent Subspaces
About
We introduce the use of latent subspaces in the exponential parameter space of product manifolds of categorial distributions as a tool for learning generative models of discrete data. The low-dimensional latent space encodes statistical dependencies and removes redundant degrees of freedom among the categorial variables. We equip the parameter domain with a Riemannian geometry such that the spaces and distances are related by isometries, which enables consistent flow matching. In particular, geodesics become straight lines, which makes model training by flow matching effective. Empirical results demonstrate that reduced latent dimensions suffice to represent data for generative modeling.
Daniel Gonzalez-Alvarado, Jonas Cassel, Stefania Petra, Christoph Schnörr • 2026
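The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear map `B`, the problem sizes, and decoding by a plain softmax are all assumptions made here for illustration. The sketch shows a low-dimensional latent vector mapped linearly into the exponential (logit) parameters of several categorical variables, and a straight-line path in that latent space, for which the flow matching regression target is the constant velocity `z1 - z0`.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax over the last axis.
    shifted = theta - theta.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 categorical variables with 4 categories each,
# represented by a 3-dimensional latent subspace.
n_vars, n_cats, latent_dim = 8, 4, 3

# Hypothetical linear map from latent coordinates to exponential parameters.
# In a real model this map would be learned; here it is random.
B = rng.standard_normal((n_vars * n_cats, latent_dim))

def decode(z):
    # Map a latent vector to one categorical distribution per variable.
    theta = (B @ z).reshape(n_vars, n_cats)
    return softmax(theta)

# Endpoints of a path: a noise sample and the (assumed) latent code of a data point.
z0 = rng.standard_normal(latent_dim)
z1 = rng.standard_normal(latent_dim)

def interpolate(t):
    # Straight line between the endpoints in latent coordinates.
    return (1.0 - t) * z0 + t * z1

# Along a straight line the velocity is constant, so the regression
# target for a flow matching vector field is simply z1 - z0.
target_velocity = z1 - z0

p_half = decode(interpolate(0.5))
print(p_half.shape)        # one row of probabilities per variable
print(p_half.sum(axis=1))  # each row sums to 1
```

Because the path is a straight line in latent coordinates, the target does not depend on `t`, which is the property the abstract credits for making flow matching training effective.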
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| DNA Sequence Modeling | Promoter (test) | -- | 7 |
| Generative Modeling | Enhancer Melanoma (test) | FBD 31.2 | 4 |
| Generative Modeling | ENHANCER FLY BRAIN (test) | FBD 17 | 4 |