ReDi: Rectified Discrete Flow
About
Discrete Flow-based Models (DFMs) are powerful generative models for high-quality discrete data, but they typically sample slowly because they rely on iterative decoding. This reliance on a multi-step process originates from the factorization approximation of DFMs, which is necessary for handling high-dimensional data. In this paper, we analyze the factorization approximation error using Conditional Total Correlation (TC) and reveal its dependence on the coupling. To address the challenge of efficient few-step generation, we propose Rectified Discrete Flow (ReDi), a novel iterative method that reduces the underlying factorization error (measured as Conditional TC) by rectifying the coupling between the source and target distributions. We theoretically prove that each ReDi step guarantees a monotonically decreasing Conditional TC, ensuring its convergence. Empirically, ReDi significantly reduces Conditional TC and enables few-step generation. Moreover, we demonstrate that the rectified couplings are well-suited for training efficient one-step models on image generation. ReDi offers a simple and theoretically grounded approach for tackling the few-step challenge, providing a new perspective on efficient discrete data synthesis. Code is available at https://github.com/Ugness/ReDi_discrete.
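To make the dependence of Conditional TC on the coupling concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a binary source variable, a two-dimensional binary target, and the standard definition of conditional total correlation as the expected KL divergence between the conditional joint and the product of its conditional marginals. The function name `conditional_tc` and the two example couplings are hypothetical and chosen only for illustration.

```python
import numpy as np

def conditional_tc(coupling):
    """Conditional total correlation TC(X1 | X0), in nats.

    coupling[a, i, j] = pi(x0 = a, x1 = (i, j)) for a two-dimensional target.
    Toy illustration only; this layout is an assumption, not the paper's code.
    """
    coupling = np.asarray(coupling, dtype=float)
    p_x0 = coupling.sum(axis=(1, 2))          # source marginal
    tc = 0.0
    for a, pa in enumerate(p_x0):
        if pa == 0:
            continue
        cond = coupling[a] / pa               # p(x1 | x0 = a)
        m1 = cond.sum(axis=1)                 # conditional marginal of dim 1
        m2 = cond.sum(axis=0)                 # conditional marginal of dim 2
        factorized = np.outer(m1, m2)         # factorized approximation
        mask = cond > 0                       # skip zero-probability cells
        tc += pa * np.sum(cond[mask] * np.log(cond[mask] / factorized[mask]))
    return tc

# Target: the two dimensions are perfectly correlated, (0,0) or (1,1).
target = np.array([[0.5, 0.0],
                   [0.0, 0.5]])
source = np.array([0.5, 0.5])

# Coupling A: source and target sampled independently.
independent = source[:, None, None] * target[None, :, :]

# Coupling B: same marginals, but x0 determines x1.
deterministic = np.zeros((2, 2, 2))
deterministic[0, 0, 0] = 0.5
deterministic[1, 1, 1] = 0.5

print("independent coupling  :", conditional_tc(independent))
print("deterministic coupling:", conditional_tc(deterministic))
```

Both couplings share the same source and target marginals, yet under the independent coupling the factorized conditional cannot represent the correlation between the two target dimensions, while under the deterministic coupling the conditional is a point mass and factorizes exactly. This is the sense in which changing the coupling alone can reduce the factorization error.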
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Code Generation | HumanEval (test) | -- | 444 |
| Mathematical Reasoning | MATH500 (test) | Accuracy 41 | 381 |
| Code Generation | MBPP (test) | -- | 276 |
| Image Generation | MNIST Binary (test) | FID 6.49 | 98 |
| Image Generation | CIFAR-10 | FID 121.1 | 88 |
| Molecular Generation | ZINC250K | Uniqueness 900 | 68 |
| Molecule Generation | ZINC 250k 2012 | Validity Score 900.1 | 56 |
| Molecule Generation | QM9 2014 (test) | Uniqueness 956.1 | 56 |
| Molecule Generation | QM9 2014 | Novelty Score 126.8 | 56 |
| Mathematical Reasoning | GSM8K (test) | Accuracy 0.7362 | 48 |