
Reducing cross-sample prediction churn in scientific machine learning

About

Scientific machine learning reports predictive performance. It does not report whether the same prediction would survive a different draw of training data. Across $9$ chemistry benchmarks, two classifiers trained on independent bootstraps of the same training set agree on aggregate accuracy to within $1.3\text{--}4.2$ percentage points but disagree on the class label of $8.0\text{--}21.8\%$ of test molecules. We call this gap cross-sample prediction churn. The standard parameter-side techniques (deep ensembles, MC dropout, stochastic weight averaging) do not reduce this gap; two data-side methods do. The first is $K$-bootstrap bagging, which cuts the churn rate by $40\text{--}54\%$ on every dataset at no accuracy cost, at $K{\times}$-ERM compute. The second is twin-bootstrap, our proposal: two networks trained jointly on independent bootstraps with a sym-KL consistency loss between their predictions, which at matched $2{\times}$-ERM compute reduces churn by a further $45\%$ (median) beyond bagging with $K{=}2$. Cross-sample prediction churn deserves a column alongside predictive performance in scientific-ML benchmark reports, because without it the parameter-side and data-side methods are indistinguishable on the metric they actually differ on.
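To make the two disagreement measures concrete, here is a minimal NumPy sketch; it is an illustration under stated assumptions and not the authors' implementation. The function names (`churn_rate`, `sym_kl`, `bagged_probs`) are hypothetical. Sym-KL is assumed to be the unscaled sum $\mathrm{KL}(p\,\|\,q) + \mathrm{KL}(q\,\|\,p)$ averaged over test points; the paper may use a different normalization. The toy probabilities are invented for the example and are not results from the paper.

```python
import numpy as np


def churn_rate(labels_a, labels_b):
    """Fraction of test points on which two models predict different labels."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    return float(np.mean(labels_a != labels_b))


def sym_kl(p, q, eps=1e-12):
    """Mean symmetric KL between two (n_samples, n_classes) probability arrays.

    Assumption: sym-KL = KL(p||q) + KL(q||p), with no 1/2 factor.
    Probabilities are clipped to eps to avoid log(0).
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0)
    kl_pq = np.sum(p * np.log(p / q), axis=1)
    kl_qp = np.sum(q * np.log(q / p), axis=1)
    return float(np.mean(kl_pq + kl_qp))


def bagged_probs(member_probs):
    """K-bootstrap bagging at prediction time: average the members' probabilities."""
    return np.mean(np.stack(member_probs), axis=0)


# Toy predictions from two models, imagined as trained on independent bootstraps.
y_true = np.array([0, 0, 1, 1])
probs_a = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.45, 0.55]])
probs_b = np.array([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.55, 0.45]])

labels_a = probs_a.argmax(axis=1)
labels_b = probs_b.argmax(axis=1)

acc_a = float(np.mean(labels_a == y_true))
acc_b = float(np.mean(labels_b == y_true))
churn = churn_rate(labels_a, labels_b)
divergence = sym_kl(probs_a, probs_b)
ensemble = bagged_probs([probs_a, probs_b])
```

In this toy case both models reach the same accuracy (0.75) while disagreeing on half of the test points, which is the gap the abstract describes: aggregate accuracy alone cannot reveal it.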

Gordan Prastalo, Kevin Maik Jablonka • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Distributional disagreement (sym-KL) | CYP2D6-Sub (id-test) | Delta sym-KL | -0.63 | 5 |
| Distributional disagreement (sym-KL) | Pgp (id-test) | Delta sym-KL | -0.5 | 5 |
| Distributional disagreement (sym-KL) | BACE (dev) | Delta sym-KL | -0.74 | 5 |
| Distributional disagreement (sym-KL) | TADF (id-test) | Delta sym-KL | -0.39 | 5 |
| Distributional disagreement (sym-KL) | MOF-thermal (id-test) | Delta sym-KL | -0.38 | 5 |
| Distributional disagreement (sym-KL) | BBBP (id-test) | Delta sym-KL | -0.46 | 5 |
| Distributional disagreement (sym-KL) | AMES (id-test) | Delta sym-KL | -1.11 | 5 |
| Molecular Property Classification | CYP2D6-Sub | Delta ID churn | -9.3 | 5 |
| Molecular Property Classification | BBB-Martins | Delta ID churn | -7.4 | 5 |
| Molecular Property Classification | AMES | Delta ID churn | -8.3 | 5 |
Showing 10 of 33 rows
