Interpretable epistemic uncertainty decomposition in sequential generative models via polynomial chaos surrogates
About
Sequential generative models conditioned on uncertain rewards are central to AI-driven scientific discovery, yet the epistemic uncertainty they inherit from imperfect reward estimates remains unquantified. We propagate this uncertainty through generative flow networks (GFlowNets) by fitting polynomial chaos expansions (PCEs) to small ensembles of trained models. The PCE coefficients yield analytical Sobol sensitivity indices, providing the first interpretable decomposition of which reward components drive which generative decisions, a capability unavailable from deep ensembles, Bayesian neural networks, or Monte Carlo dropout. Convergence guarantees are established theoretically, and four of the five are formally verified in the Lean 4 proof assistant. Across three real-world tasks, the framework reveals actionable structure invisible to ensembles alone. On the Doyle-Dreher Buchwald-Hartwig dataset, catalyst selection is robust ($D_{\mathrm{catalyst}}\approx 71$) while additive selection is fragile ($D_{\mathrm{additive}}\approx 179$, $2.5\times$ higher). In fragment-based molecular design, the linker position is the most sensitive ($D_{\mathrm{linker}}\approx 28$) while decoration positions are the most robust ($D\approx 14$-$18$), reversing the conventional scaffold-robust / decoration-fragile assumption. On the Sachs protein signalling network, MAPK-cascade edges and PKA/PKC hub edges separate into distinct sensitivity regimes, providing a targeted map for perturbation experiments. Calibration coverage at the 95% level reaches 0.97-1.00 across the dominant steps, and the surrogate evaluates 10,000 policy samples in milliseconds, $10^{3}$-$10^{4}\times$ faster than exhaustive retraining.
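The step from PCE coefficients to Sobol indices can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes independent reward perturbations uniform on $[-1, 1]$ (hence a Legendre basis), an ordinary least-squares fit, and a hypothetical stand-in function `toy_policy_logit` in place of the outputs of an ensemble of trained GFlowNets. All function names are illustrative.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre


def multi_indices(dim, degree):
    """All degree tuples (one entry per input) with total degree <= degree."""
    return [a for a in itertools.product(range(degree + 1), repeat=dim)
            if sum(a) <= degree]


def design_matrix(x, indices):
    """Evaluate the tensor-product Legendre basis at points x in [-1, 1]^dim."""
    cols = []
    for alpha in indices:
        col = np.ones(x.shape[0])
        for j, n in enumerate(alpha):
            # A unit coefficient vector makes legval evaluate P_n alone.
            col *= legendre.legval(x[:, j], [0] * n + [1])
        cols.append(col)
    return np.column_stack(cols)


def fit_pce(x, y, degree):
    """Least-squares PCE fit; returns the multi-indices and coefficients."""
    indices = multi_indices(x.shape[1], degree)
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, indices), y, rcond=None)
    return indices, coeffs


def sobol_indices(indices, coeffs):
    """First-order and total Sobol indices read off the PCE coefficients."""
    dim = len(indices[0])
    # For x ~ Uniform(-1, 1), E[P_n(x)^2] = 1 / (2n + 1).
    norms = np.array([np.prod([1.0 / (2 * n + 1) for n in a]) for a in indices])
    contrib = coeffs ** 2 * norms
    nonconst = np.array([sum(a) > 0 for a in indices])
    variance = contrib[nonconst].sum()
    first, total = np.zeros(dim), np.zeros(dim)
    for a, c in zip(indices, contrib):
        active = [j for j, n in enumerate(a) if n > 0]
        if len(active) == 1:
            first[active[0]] += c   # term depends on one input only
        for j in active:
            total[j] += c           # every term that involves input j
    return first / variance, total / variance


def toy_policy_logit(x):
    """Hypothetical stand-in for one policy output as a function of
    three reward perturbations; the third has no effect."""
    return 2.0 * x[:, 0] + x[:, 1] + 0.5 * x[:, 0] * x[:, 1]


rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(200, 3))  # toy "ensemble" inputs
indices, coeffs = fit_pce(samples, toy_policy_logit(samples), degree=2)
first, total = sobol_indices(indices, coeffs)
```

Because the toy function is itself a degree-2 polynomial, the fit is exact and the indices match the analytic values: first-order indices of 48/61, 12/61 and 0, and total indices of 49/61, 13/61 and 0. The gap between first-order and total indices is the share of variance due to the interaction term. Once the coefficients are fitted, no further sampling or retraining is needed to obtain the indices, which is what makes the decomposition analytical.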
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Surrogate policy prediction | Sachs ensemble (test) | Policy MAE | 0.016 | 3 |
| Surrogate Policy Generation | Buchwald-Hartwig | Policy MAE | 0.153 | 3 |
| Surrogate Policy Generation | Discrete grid | Ensemble Training Time | 0.5 | 1 |
| Surrogate Policy Generation | Continuous grid | Ensemble Training Time (h) | 0.3 | 1 |
| Surrogate Policy Generation | Symbolic Regression | Ensemble Training Time (h) | 0.75 | 1 |
| Surrogate Policy Generation | LLM GFlowNet | Ensemble Training Time (h) | 0.5 | 1 |
| Surrogate Policy Generation | Sachs 11-node | Ensemble Training Time (h) | 0.5 | 1 |