Interpretable epistemic uncertainty decomposition in sequential generative models via polynomial chaos surrogates

About

Sequential generative models conditioned on uncertain rewards are central to AI-driven scientific discovery, yet the epistemic uncertainty they inherit from imperfect reward estimates remains unquantified. We propagate this uncertainty through generative flow networks (GFlowNets) by fitting polynomial chaos expansions (PCEs) to small ensembles of trained models. The PCE coefficients yield analytical Sobol sensitivity indices, providing the first interpretable decomposition of which reward components drive which generative decisions, a capability unavailable from deep ensembles, Bayesian neural networks, or Monte Carlo dropout. Convergence guarantees are established theoretically, and four of five are formally verified in the Lean 4 proof assistant. Across three real-world tasks, the framework reveals actionable structure invisible to ensembles alone. On the Doyle-Dreher Buchwald-Hartwig dataset, catalyst selection is robust ($D_{\mathrm{catalyst}}\approx 71$) while additive selection is fragile ($D_{\mathrm{additive}}\approx 179$, $2.5\times$ higher). In fragment-based molecular design, the linker position is the most sensitive ($D_{\mathrm{linker}}\approx 28$) while decoration positions are the most robust ($D\approx 14$-$18$), reversing the conventional scaffold-robust / decoration-fragile assumption. On the Sachs protein signalling network, MAPK-cascade edges and PKA/PKC hub edges separate into distinct sensitivity regimes, providing a targeted map for perturbation experiments. Calibration coverage at the 95% level reaches 0.97-1.00 across the dominant steps, and the surrogate evaluates 10,000 policy samples in milliseconds, $10^{3}$ to $10^{4}\times$ faster than exhaustive retraining.

Ramón Nartallo-Kaluarachchi, Shashanka Ubaru, Małgorzata J Zimoń, Dongsung Huh, Robert Manson-Sawko, Lior Horesh, Yoshua Bengio • 2025
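
The abstract's central step, reading Sobol sensitivity indices directly off PCE coefficients, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the inputs are taken to be independent and uniform on [-1, 1], the basis is orthonormal Legendre polynomials, the fit is ordinary least squares, and the "ensemble" is a toy function standing in for the outputs of trained GFlowNets. The helper names (`multi_indices`, `design_matrix`, `sobol_indices`) are hypothetical.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre


def multi_indices(dim, degree):
    """All exponent tuples of length `dim` with total degree <= `degree`."""
    return [a for a in itertools.product(range(degree + 1), repeat=dim)
            if sum(a) <= degree]


def design_matrix(x, indices):
    """Evaluate the orthonormal Legendre product basis at x in [-1, 1]^dim."""
    n, dim = x.shape
    max_deg = max(max(a) for a in indices)
    uni = np.empty((dim, max_deg + 1, n))
    for k in range(max_deg + 1):
        c = np.zeros(k + 1)
        c[k] = 1.0  # coefficient vector selecting the k-th Legendre polynomial
        for d in range(dim):
            # sqrt(2k+1) makes P_k unit-variance under the uniform density 1/2
            uni[d, k] = np.sqrt(2 * k + 1) * legendre.legval(x[:, d], c)
    cols = [np.prod([uni[d, a[d]] for d in range(dim)], axis=0)
            for a in indices]
    return np.column_stack(cols)


def sobol_indices(coef, indices):
    """First-order and total Sobol indices from orthonormal PCE coefficients."""
    dim = len(indices[0])
    sq = coef ** 2
    # Variance is the sum of squared coefficients, excluding the constant term
    var = sum(s for s, a in zip(sq, indices) if any(a))
    first = np.zeros(dim)
    total = np.zeros(dim)
    for s, a in zip(sq, indices):
        for i in range(dim):
            if a[i] > 0:
                total[i] += s  # every term that involves input i
                if sum(a) == a[i]:
                    first[i] += s  # terms that involve input i only
    return first / var, total / var


# Toy stand-in for a small ensemble: 40 perturbed inputs, one scalar output.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(40, 3))
y = 2.0 * x[:, 0] + x[:, 1] * x[:, 2]

indices = multi_indices(dim=3, degree=2)
coef = np.linalg.lstsq(design_matrix(x, indices), y, rcond=None)[0]
first, total = sobol_indices(coef, indices)
print("first-order:", np.round(first, 3))
print("total:      ", np.round(total, 3))
```

For this toy function the indices are known in closed form: the first input carries 12/13 of the variance on its own, while the second and third contribute only through their interaction, so their first-order indices are zero and their total indices are 1/13 each. Because the basis is orthonormal, no additional sampling is needed once the coefficients are fitted, which is the sense in which the indices are analytical.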

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Surrogate policy prediction | Sachs ensemble (test) | Policy MAE | 0.016 | 3 |
| Surrogate Policy Generation | Buchwald-Hartwig | Policy MAE | 0.153 | 3 |
| Surrogate Policy Generation | Discrete grid | Ensemble Training Time | 0.5 | 1 |
| Surrogate Policy Generation | Continuous grid | Ensemble Training Time (h) | 0.3 | 1 |
| Surrogate Policy Generation | Symbolic Regression | Ensemble Training Time (h) | 0.75 | 1 |
| Surrogate Policy Generation | LLM GFlowNet | Ensemble Training Time (h) | 0.5 | 1 |
| Surrogate Policy Generation | Sachs 11-node | Ensemble Training Time (h) | 0.5 | 1 |