A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods

About

We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this it to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation in the space of probability measures. On a technical level, our contribution amounts to studying generalised variational inference through the lense of Wasserstein gradient flows. The result is a unified theory of various seemingly disconnected approaches that are commonly used for uncertainty quantification in deep learning -- including deep ensembles and (variational) Bayesian methods. This offers a fresh perspective on the reasons behind the success of deep ensembles over procedures based on parameterised variational inference, and allows the derivation of new ensembling schemes with convergence guarantees. We showcase this by proposing a family of interacting deep ensembles with direct parallels to the interactions of particle systems in thermodynamics, and use our theory to prove the convergence of these algorithms to a well-defined global minimiser on the space of probability measures.

Veit David Wild, Sahra Ghalebikesabi, Dino Sejdinovic, Jeremias Knoblauch• 2023

Related benchmarks

Task	Dataset	Result
Regression	UCI ENERGY (test)	Negative Log Likelihood2.43	62
Regression	UCI CONCRETE (test)	Neg Log Likelihood5.11	51
Regression	UCI POWER (test)	Negative Log Likelihood13.87	43
Regression	UCI NAVAL (test)	Negative Log Likelihood-3.04	42
Regression	UCI WINE (test)	Negative Log Likelihood7.13	38
Regression	UCI YACHT (test)	Negative Log Likelihood1.64	33
Regression	UCI KIN8NM (test)	NLL0.46	25
Regression	UCI PROTEIN (test)	Negative Log Likelihood43.2	8

Showing 8 of 8 rows

Other info

Code

Follow for update

@wizwand_team Discord