QGFN: Controllable Greediness with Action Values

About

Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate, $Q$, to create greedier sampling policies which can be controlled by a mixing parameter. We show that several variants of the proposed method, QGFN, are able to improve on the number of high-reward samples generated in a variety of tasks without sacrificing diversity.

Elaine Lau, Stephen Zhewen Lu, Ling Pan, Doina Precup, Emmanuel Bengio • 2024
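As a rough illustration of the mixing idea described in the abstract, the sketch below picks the greedy action under $Q$ with probability p and otherwise samples from the GFN forward policy. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `p_greedy_action`, its arguments, and the use of plain Python lists for the policy probabilities and action values are all hypothetical choices made for this example.

```python
import random


def p_greedy_action(gfn_probs, q_values, p, rng=random):
    """Return an action index from a mixture of a greedy-Q policy and a GFN policy.

    Hypothetical helper for illustration only.
    gfn_probs: GFN forward-policy probabilities over the available actions.
    q_values:  action-value estimates Q(s, a) for the same actions.
    p:         mixing parameter in [0, 1]; larger values mean greedier sampling.
    """
    if len(gfn_probs) != len(q_values):
        raise ValueError("gfn_probs and q_values must have the same length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")

    # With probability p, act greedily with respect to Q.
    if rng.random() < p:
        return max(range(len(q_values)), key=q_values.__getitem__)

    # Otherwise, sample an action from the GFN policy.
    return rng.choices(range(len(gfn_probs)), weights=gfn_probs, k=1)[0]


# Example: p = 0 recovers the GFN policy, p = 1 is fully greedy on Q.
action = p_greedy_action([0.7, 0.2, 0.1], [0.1, 0.5, 2.0], p=0.5)
```

Setting p between 0 and 1 trades off the diversity of the GFN policy against the greediness of the action-value estimate, which is the kind of control the abstract refers to.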

Related benchmarks

Task	Dataset	Metric	Result	Rank
Molecule Design	Molecule Design (test)	Mode Coverage (R>7.5)	8.57e+3	26
Maximum Independent Set	Small graphs (test)	MIS Size	18.26	4

