QGFN: Controllable Greediness with Action Values
About
Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate, $Q$, to create greedier sampling policies which can be controlled by a mixing parameter. We show that several variants of the proposed method, QGFN, are able to improve on the number of high-reward samples generated in a variety of tasks without sacrificing diversity.
Elaine Lau, Stephen Zhewen Lu, Ling Pan, Doina Precup, Emmanuel Bengio• 2024
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Molecule Design | Molecule Design (test) | Mode Coverage (R>7.5)8.57e+3 | 26 | |
| Maximum Independent Set | Small graphs (test) | MIS Size18.26 | 4 |
Showing 2 of 2 rows