Automatic Laplace Collapsed Sampling: Scalable Marginalisation of Latent Parameters via Automatic Differentiation
About
We present Automatic Laplace Collapsed Sampling (ALCS), a general framework for marginalising latent parameters in Bayesian models using automatic differentiation, which we combine with nested sampling to explore the hyperparameter space robustly and efficiently. At each nested sampling likelihood evaluation, ALCS collapses the high-dimensional latent variables $z$ to a scalar contribution via maximum a posteriori (MAP) optimisation and a Laplace approximation, both computed using autodiff. This reduces the effective dimension from $d_\theta + d_z$ to just $d_\theta$, making Bayesian evidence computation tractable in high-dimensional settings without hand-derived gradients or Hessians and with minimal model-specific engineering. The MAP optimisation and Hessian evaluation are parallelised across live points on GPU hardware, making the method practical at scale. We also show that automatic differentiation enables local approximations beyond Laplace, using parametric families such as the Student-$t$, which improves evidence estimates for heavy-tailed latents. We validate ALCS on a suite of benchmarks spanning hierarchical, time-series, and discrete-likelihood models, and establish where the Gaussian approximation holds. This in turn enables a post-hoc effective sample size (ESS) diagnostic that localises failures of the approximation across hyperparameter space without expensive joint sampling.
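The collapse step described above can be illustrated with the standard Laplace identity $\log p(y \mid \theta) \approx \log p(y, z^\star \mid \theta) + \tfrac{d_z}{2}\log 2\pi - \tfrac{1}{2}\log\det H$, where $z^\star$ is the MAP estimate of the latents and $H$ is the Hessian of $-\log p(y, z \mid \theta)$ at $z^\star$. The following is a minimal JAX sketch under stated assumptions, not the authors' implementation: the toy model (Gaussian latents with Gaussian observations), the names `log_joint`, `laplace_log_marginal`, `batched_log_marginal`, and `SIGMA`, and the use of plain Newton iterations for the MAP search are all hypothetical choices made for illustration. `jax.vmap` stands in for the parallelisation across live points.

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm

jax.config.update("jax_enable_x64", True)

SIGMA = 0.5  # assumed known observation noise (hypothetical toy setting)


def log_joint(z, theta, y):
    """log p(y, z | theta) for a toy model:
    z_i ~ N(0, tau^2) with tau = exp(theta), and y_i ~ N(z_i, SIGMA^2)."""
    tau = jnp.exp(theta)
    log_prior = jnp.sum(norm.logpdf(z, 0.0, tau))
    log_lik = jnp.sum(norm.logpdf(y, z, SIGMA))
    return log_prior + log_lik


def laplace_log_marginal(theta, y, n_newton=20):
    """Laplace estimate of log p(y | theta) with z collapsed out."""
    neg_log_joint = lambda z: -log_joint(z, theta, y)
    grad_fn = jax.grad(neg_log_joint)      # gradient via autodiff
    hess_fn = jax.hessian(neg_log_joint)   # dense Hessian via autodiff

    def newton_step(z, _):
        # Undamped Newton update; an assumption that suits this convex toy
        # problem, a real model may need a line search or another optimiser.
        z_new = z - jnp.linalg.solve(hess_fn(z), grad_fn(z))
        return z_new, None

    z_map, _ = jax.lax.scan(newton_step, jnp.zeros_like(y), None, length=n_newton)

    H = hess_fn(z_map)
    _, logdet_H = jnp.linalg.slogdet(H)
    d_z = y.shape[0]
    return log_joint(z_map, theta, y) + 0.5 * d_z * jnp.log(2.0 * jnp.pi) - 0.5 * logdet_H


# Evaluate a batch of hyperparameter values (standing in for live points) at once.
batched_log_marginal = jax.jit(jax.vmap(laplace_log_marginal, in_axes=(0, None)))

y_obs = jnp.array([0.3, -1.2, 0.8, 2.0])
thetas = jnp.array([-1.0, 0.0, 0.7])
log_marginals = batched_log_marginal(thetas, y_obs)
```

Because this toy joint density is Gaussian in $z$, the Laplace estimate coincides with the exact marginal $y_i \sim \mathcal{N}(0, \tau^2 + \sigma^2)$, which makes it a convenient correctness check. For non-Gaussian latents the same function returns only an approximation, which is the regime the ESS diagnostic is meant to probe.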
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Posterior Inference | Inference Gym Eight Schools | Delta | 8 | 1 |
| Posterior Inference | Inference Gym Brownian Motion T = 50 | Delta Error | 0.06 | 1 |
| Posterior Inference | Inference Gym LGCP (M = 100) | Delta | 0.12 | 1 |
| Posterior Inference | Inference Gym SV SP500 (T = 100) | Delta | 0.32 | 1 |
| Posterior Inference | Inference Gym IRT Ns = 400 | ESS/K Ratio | 0.1 | 1 |
| Bayesian Inference | Eight Schools | -- | -- | 1 |
| Bayesian Inference | Radon J = 85 | -- | -- | 1 |
| Bayesian Inference | Brownian Motion (T = 50) | -- | -- | 1 |
| Bayesian Inference | LGCP M = 100 | -- | -- | 1 |
| Bayesian Inference | SV T = 100 | -- | -- | 1 |