
Calibrating Scientific Foundation Models with Inference-Time Stochastic Attention

About

Transformer-based scientific foundation models are increasingly deployed in high-stakes settings, but current architectures produce deterministic outputs and offer limited support for calibrated predictive uncertainty. We propose Stochastic Attention, a lightweight, sample-average inference-time modification that randomizes attention by replacing softmax weights with normalized multinomial samples controlled by a single concentration parameter, producing predictive ensembles without retraining. To set this parameter, we introduce a calibration objective that matches the stochastic attention output to the target, yielding an efficient univariate post-hoc tuning problem. We evaluate this mechanism on scientific foundation models for weather and time-series forecasting, as well as on several regression tasks. Across benchmarks against uncertainty-aware baselines, we find that Sample Average Stochastic Attention achieves the strongest native calibration and the sharpest prediction intervals at comparable calibration, with adaptation costs nearly three orders of magnitude lower than the next-best baseline.
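
The sampling step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes single-head scaled dot-product attention, and it assumes the concentration parameter acts as the number of multinomial trials (called `n_trials` here). The function names `stochastic_attention` and `attention_ensemble` are hypothetical.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_weights(q, k):
    """Deterministic scaled dot-product attention weights, shape (n_q, n_k)."""
    d = q.shape[-1]
    return softmax((q @ k.T) / np.sqrt(d)).astype(np.float64)


def stochastic_attention(q, k, v, n_trials, rng):
    """One stochastic pass: softmax weights are replaced by normalized
    multinomial counts. Assumption: n_trials plays the role of the
    concentration parameter (larger means closer to deterministic)."""
    probs = attention_weights(q, k)
    counts = rng.multinomial(n_trials, probs)  # one draw per query row
    weights = counts / n_trials                # each row sums to 1
    return weights @ v


def attention_ensemble(q, k, v, n_trials, n_members, seed=0):
    """Stack n_members stochastic passes into a predictive ensemble."""
    rng = np.random.default_rng(seed)
    return np.stack(
        [stochastic_attention(q, k, v, n_trials, rng) for _ in range(n_members)]
    )
```

Under these assumptions, the expected value of the sampled weights equals the softmax weights, so the ensemble mean approaches the deterministic attention output as `n_trials` grows, while smaller values of `n_trials` widen the ensemble spread. Tuning that single scalar after training is what makes the calibration step a univariate problem.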

Akash Yadav, Taiwo A. Adebiyi, Ruda Zhang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Regression | UCI Concrete | Sharpness | 1.881 | 12 |
| Regression | UCI Naval | Sharpness Score | 1 | 12 |
| Regression | UCI Energy | W1 Score | 0.03 | 6 |
| Regression | UCI Kin8nm | W1 Error | 0.07 | 6 |
| Regression | UCI Yacht | W1 Error | 0.021 | 6 |
| Regression | UCI Energy | Sharp/SA | 1 | 6 |
| Regression | UCI Kin8nm | Sharp/SA | 1 | 6 |
| Regression | UCI Protein | Sharp/SA Error | 1 | 6 |
| Regression | UCI Wine | Sharp/SA | 1 | 6 |
| Regression | UCI Yacht | Sharpness (SA) | 1 | 6 |
Showing 10 of 28 rows
