Direct Bethe Free Energy Minimization for Bayesian Neural Networks

About

Bayesian neural networks are typically trained on the evidence lower bound (ELBO), which keeps the joint likelihood but pays a Jensen gap at every observation. We train by local consistency instead: direct minimisation of the Bethe free energy, whose data term pays no gap. It scores each observation exactly by its predictive density, a strictly proper scoring rule, for any likelihood with a tractable predictive convolution. Our departure is free routing: the beliefs are trained as free parameters of this objective, jointly with the backbone, rather than bound to the conjugate posterior computed in closed form (closed routing). Instantiated with a Gaussian last layer over a deterministic backbone, exact inference appears as the known, closed-routed corner: the neural-linear marginal likelihood, which is evidence-optimal and keeps the joint. A shared cavity instead trades the joint for a batchable per-plate predictive score, and free routing reaches that score's optimum, which is unattainable under the binding whenever the noise is heteroscedastic, improving NLL and calibration over the exact corner. This instance, SCROLL (Shared-Cavity fRee-rOuting Last-Layer), is a single-pass, any-likelihood Bayesian neural network that is implicitly empirical-Bayes: prior precision, observation noise, covariance, and backbone are fit in one gradient pass. At a single training run and forward pass per architecture (where the conventional references cross-validate their regularisation weight and ensembles pay 5-50x at inference), a fixed SCROLL variant is best or tied on NLL and calibration on 7/8 UCI regression benchmarks, and best on 4/5 across three large tabular datasets and two frozen text/vision embeddings.
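As a rough illustration of the Gaussian last-layer case mentioned above, the sketch below scores observations by their predictive density and computes the conjugate posterior that closed routing would bind the beliefs to. It is not the paper's implementation: the function name `gaussian_predictive_nll`, the toy data, and the fixed `prior_prec` and `noise_var` values are assumptions made for illustration, the features `phi` stand in for a backbone's output, and the shared cavity is omitted (observations are scored in-sample under the full posterior). Under free routing, `mean` and `cov` would instead be treated as trainable parameters of the score.

```python
import numpy as np

def gaussian_predictive_nll(phi, y, mean, cov, noise_var):
    """Mean negative log predictive density for a Gaussian last layer.

    With weight belief N(mean, cov) and Gaussian noise, the predictive for
    row phi_n is N(phi_n @ mean, phi_n @ cov @ phi_n + noise_var).
    """
    mu = phi @ mean
    var = np.einsum("nd,de,ne->n", phi, cov, phi) + noise_var
    nll = 0.5 * (np.log(2.0 * np.pi * var) + (y - mu) ** 2 / var)
    return float(np.mean(nll))

# Toy data (hypothetical): 64 feature vectors standing in for backbone outputs.
rng = np.random.default_rng(0)
phi = rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = phi @ w_true + 0.1 * rng.normal(size=64)

# Closed routing: beliefs bound to the conjugate Gaussian posterior.
prior_prec = 1.0   # assumed prior precision
noise_var = 0.01   # assumed homoscedastic noise variance
precision = prior_prec * np.eye(3) + phi.T @ phi / noise_var
cov = np.linalg.inv(precision)
mean = cov @ (phi.T @ y) / noise_var

nll_closed = gaussian_predictive_nll(phi, y, mean, cov, noise_var)
```

Because the predictive variance adds the belief's contribution `phi_n @ cov @ phi_n` to the noise variance, the score penalises both overconfident and underconfident beliefs, which is what makes it usable as a training objective for `mean` and `cov` directly.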

Pavel Prochazka • 2026

Related benchmarks

Task	Dataset	Result
Regression	UCI ENERGY (test)	Negative Log Likelihood: 0.838	71
Regression	UCI CONCRETE (test)	Neg Log Likelihood: 3.416	51
Regression	Boston UCI (test)	--	45
Regression	UCI POWER (test)	Negative Log Likelihood: 2.875	43
Regression	UCI NAVAL (test)	Negative Log Likelihood: -3.141	42
Regression	UCI WINE (test)	Negative Log Likelihood: 0.959	38
Regression	Naval	RMSE: 0.009	25
Regression	Kin8nm	RMSE: 0.165	24
Regression	Energy	RMSE: 0.571	24
Regression	UCI Yacht	NLL: 3.4	20

Showing 10 of 17 rows
