Reinforced sequential Monte Carlo for amortised sampling

About

This paper proposes a synergy of amortised and particle-based methods for sampling from distributions defined by unnormalised density functions. We state a connection between sequential Monte Carlo (SMC) and neural sequential samplers trained by maximum-entropy reinforcement learning (MaxEnt RL), wherein learnt sampling policies and value functions define proposal kernels and twist functions. Exploiting this connection, we introduce an off-policy RL training procedure for the sampler that uses samples from SMC -- using the learnt sampler as a proposal -- as a behaviour policy that better explores the target distribution. We describe techniques for stable joint training of proposals and twist functions and an adaptive weight tempering scheme to reduce training signal variance. Furthermore, building upon past attempts to use experience replay to guide the training of neural samplers, we derive a way to combine historical samples with annealed importance sampling weights within a replay buffer. On synthetic multi-modal targets (in both continuous and discrete spaces) and the Boltzmann distribution of alanine dipeptide conformations, we demonstrate improvements in approximating the true distribution as well as training stability compared to both amortised and Monte Carlo methods.
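For readers unfamiliar with the SMC side of this connection, the following is a minimal sketch of a generic annealed SMC sampler on a toy one-dimensional, two-mode target. It is not the paper's method: a fixed random-walk Metropolis kernel stands in for the learnt proposal, no twist functions are learnt, and there is no RL training or replay buffer. All names (`log_target`, `log_init`, `log_annealed`, `smc`), the geometric annealing path, and the settings (500 particles, 20 steps, resampling when the effective sample size drops below half) are illustrative assumptions.

```python
import math
import random

SIGMA0 = 5.0  # assumed width of the initial Gaussian


def log_target(x):
    # Unnormalised log-density of a toy two-mode target (modes at -3 and +3).
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_init(x):
    # Unnormalised log-density of the initial distribution N(0, SIGMA0^2).
    return -0.5 * (x / SIGMA0) ** 2


def log_annealed(x, beta):
    # Geometric path between the initial density (beta=0) and the target (beta=1).
    return (1.0 - beta) * log_init(x) + beta * log_target(x)


def normalise(log_w):
    # Convert log-weights to normalised weights in a numerically stable way.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def smc(num_particles=500, num_steps=20, step_size=1.0, seed=0):
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, SIGMA0) for _ in range(num_particles)]
    log_w = [0.0] * num_particles
    for t in range(1, num_steps + 1):
        beta_prev, beta = (t - 1) / num_steps, t / num_steps
        # Reweight: ratio of consecutive annealed densities at the current particle.
        log_w = [
            lw + log_annealed(x, beta) - log_annealed(x, beta_prev)
            for lw, x in zip(log_w, xs)
        ]
        w = normalise(log_w)
        # Resample only when the effective sample size falls below half.
        ess = 1.0 / sum(wi * wi for wi in w)
        if ess < 0.5 * num_particles:
            xs = rng.choices(xs, weights=w, k=num_particles)
            log_w = [0.0] * num_particles
        # Move: one random-walk Metropolis step targeting the current annealed density.
        moved = []
        for x in xs:
            prop = x + rng.gauss(0.0, step_size)
            log_acc = log_annealed(prop, beta) - log_annealed(x, beta)
            accept = rng.random() < math.exp(min(0.0, log_acc))
            moved.append(prop if accept else x)
        xs = moved
    return xs, normalise(log_w)
```

Calling `smc()` returns the final particles together with their normalised weights; weighted averages over them estimate expectations under the target. In the setting the abstract describes, the hand-chosen kernel here would be replaced by the learnt sampling policy, with the learnt value function acting as a twist.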

Sanghyeok Choi, Sarthak Mittal, Víctor Elvira, Jinkyoo Park, Esmeralda S. Whitammer • 2025

Related benchmarks

Task | Dataset | Result | Rank
Target Distribution Sampling | Funnel 10D | Sinkhorn Distance 113.2 | 29
Sampling on discretised synthetic densities | Manywell d = 32 | Sinkhorn Dist. 21.91 | 15
Amortised Sampling | MoS d = 50 | Sinkhorn Cost 2.02e+3 | 13
Amortised Sampling | Robot4 d = 10 | Sinkhorn Distance 0.39 | 12
Amortised Sampling | GMM40 d = 50 | Sinkhorn Distance 3.58e+3 | 12
Amortised Sampling | ManyWell d = 64 | MMD 0.043 | 10
Amortised Sampling | GMM40 d = 2 | Sinkhorn Distance 6.46 | 7
Amortised Sampling | GMM40 d = 5 | Sinkhorn Distance 83.3 | 7
Chemical sequence design | sEH | ELBO 52.537 | 6
Biological sequence design | L14-RNA1 | ELBO 17.225 | 6
Showing 10 of 13 rows
