Controlling Exploration-Exploitation in GFlowNets via Markov Chain Perspectives

About

Generative Flow Network (GFlowNet) objectives implicitly fix an equal mixing of forward and backward policies, potentially constraining the exploration-exploitation trade-off during training. By further exploring the link between GFlowNets and Markov chains, we establish an equivalence between GFlowNet objectives and Markov chain reversibility, thereby revealing the origin of such constraints, and provide a framework for adapting Markov chain properties to GFlowNets. Building on these theoretical findings, we propose $\alpha$-GFNs, which generalize the mixing via a tunable parameter $\alpha$. This generalization enables direct control over exploration-exploitation dynamics to enhance mode discovery capabilities, while ensuring convergence to unique flows. Across various benchmarks, including Set, Bit Sequence, and Molecule Generation, $\alpha$-GFN objectives consistently outperform previous GFlowNet objectives, achieving up to a $10 \times$ increase in the number of discovered modes.

Lin Chen, Samuel Drapeau, Fanghao Shao, Xuekai Zhu, Bo Xue, Yunchong Song, Mathieu Laurière, Zhouhan Lin • 2026
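
The abstract ties GFlowNet objectives to Markov chain reversibility. As a rough illustration of that standard notion only (not of the paper's $\alpha$-GFN objective, which the abstract does not spell out), the sketch below checks detailed balance, $\pi_i P_{ij} = \pi_j P_{ji}$, on two small hand-made chains. The function names and the example transition matrices are hypothetical and chosen purely for illustration; the stationary distribution is approximated by power iteration, which assumes an ergodic chain.

```python
def stationary_distribution(P, iters=10_000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (assumes the chain is ergodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # One step of pi <- pi P.
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def detailed_balance_gap(P):
    """Largest violation of pi[i] * P[i][j] == pi[j] * P[j][i].
    A gap of (numerically) zero means the chain is reversible."""
    pi = stationary_distribution(P)
    n = len(P)
    return max(
        abs(pi[i] * P[i][j] - pi[j] * P[j][i])
        for i in range(n)
        for j in range(n)
    )


# Hypothetical example 1: a birth-death chain, which satisfies detailed balance.
birth_death = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

# Hypothetical example 2: a biased walk on a 3-cycle. Its stationary
# distribution is uniform, but probability circulates in one direction,
# so detailed balance fails.
biased_cycle = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
    [0.9, 0.1, 0.0],
]

print("birth-death gap:", detailed_balance_gap(birth_death))
print("biased-cycle gap:", detailed_balance_gap(biased_cycle))
```

The first chain yields a gap that is zero up to floating-point error, while the second yields a clearly positive gap, which is the distinction between reversible and non-reversible dynamics that the abstract refers to.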

Related benchmarks

Task                     Dataset                        Result         Rank
Bit Sequence Generation  Bit Sequence Generation k=4    Modes 59.4     10
Bit Sequence Generation  Bit Sequence Generation k=8    Modes 60       10
Bit Sequence Generation  Bit Sequence Generation k=10   Modes 37.4     10
Molecule Generation      Molecule Generation            Modes 40.2     10
Set Generation           Set Generation Small           Modes 90       10
Set Generation           Set Generation Medium          Modes 499      10
Set Generation           Set Generation Large           Modes 2.24e+3  10
Bit Sequence Generation  Bit Sequence Generation k=2    Modes 60       10
Bit Sequence Generation  Bit Sequence Generation k=6    Modes 47.8     10
