Learning Boltzmann Generators via Constrained Mass Transport
About
Efficient sampling from high-dimensional, multimodal, unnormalized probability distributions is a central challenge in many areas of science and machine learning. We focus on Boltzmann generators (BGs), which aim to sample the Boltzmann distribution of physical systems, such as molecules, at a given temperature. Classical variational approaches that minimize the reverse Kullback-Leibler (KL) divergence are prone to mode collapse, while annealing-based methods, which commonly use geometric schedules, can suffer from mass teleportation and depend heavily on schedule tuning. We introduce Constrained Mass Transport (CMT), a variational framework that generates intermediate distributions under constraints on both the KL divergence and the entropy decay between successive steps. These constraints enhance distributional overlap, mitigate mass teleportation, and counteract premature convergence. Across standard BG benchmarks and the newly introduced ELIL tetrapeptide, the largest system studied to date without access to samples from molecular dynamics, CMT consistently surpasses state-of-the-art variational methods, achieving more than 2.5x higher effective sample size while avoiding mode collapse.
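The headline metric above is the effective sample size (ESS). The abstract does not say which estimator is used, so the following is only a minimal sketch that assumes the standard Kish formula applied to importance weights, with log-weights defined as the unnormalized target log-density minus the model log-density at each sample. The function name `normalized_ess` is hypothetical and is not taken from the paper.

```python
import math


def normalized_ess(log_weights):
    """Kish effective sample size divided by the number of samples.

    Assumption (not from the paper): log_weights[i] is
    log p_unnormalized(x_i) - log q(x_i) for samples x_i drawn from q.
    Returns a value in (0, 1]; 1 means all weights are equal.
    """
    n = len(log_weights)
    if n == 0:
        raise ValueError("need at least one log-weight")
    # Subtract the maximum for numerical stability. The shift cancels
    # in the ratio, so unknown normalizing constants do not matter.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    sum_w = sum(weights)
    sum_w_sq = sum(w * w for w in weights)
    return (sum_w * sum_w) / (n * sum_w_sq)


# Equal weights: every sample contributes fully.
uniform = normalized_ess([0.0, 0.0, 0.0, 0.0])
# One dominant weight: only one of four samples effectively counts.
collapsed = normalized_ess([0.0, -1000.0, -1000.0, -1000.0])
```

A sampler whose weights are dominated by a few samples yields a normalized ESS near 1/N, which is why poor overlap between the model and the target distribution shows up as a low ESS.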
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Boltzmann distribution modeling | ELIL Tetrapeptide d=219 (test) | Target Evaluations | 8 | 5 |
| Boltzmann distribution modeling | Alanine Hexapeptide d=180 (test) | Target Evaluations | 4 | 5 |
| Boltzmann distribution modeling | Alanine Dipeptide d=60 (test) | Target Evaluations | 1 | 5 |
| Boltzmann distribution modeling | Alanine Tetrapeptide d=120 (test) | Evaluation Scale | 1 | 5 |
| Boltzmann Generation | Alanine Dipeptide d=60 | NLL | 214.4 | 5 |
| Molecular distribution matching | Alanine Dipeptide | Ram-T-W2 | 0.059 | 3 |
| Molecular distribution matching | Alanine Tetra-peptide | Ram-T-W2 | 0.492 | 3 |
| Molecular distribution matching | ELIL Tetrapeptide | Ram-T-W2 | 0.631 | 3 |
| Molecular distribution matching | Alanine Hexa-peptide | Ram-T-W2 | 0.833 | 3 |
| Molecular Simulation Distribution Matching | ELIL Tetrapeptide | TICA KL Divergence | 8.58 | 2 |