
Incremental Transformer Neural Processes

About

Neural Processes (NPs), and specifically Transformer Neural Processes (TNPs), have demonstrated remarkable performance across tasks ranging from spatiotemporal forecasting to tabular data modelling. However, many of these applications are inherently sequential, involving continuous data streams such as real-time sensor readings or database updates. In such settings, models should support cheap, incremental updates rather than recomputing internal representations from scratch for every new observation -- a capability existing TNP variants lack. Drawing inspiration from Large Language Models, we introduce the Incremental TNP (incTNP). By leveraging causal masking, Key-Value (KV) caching, and a data-efficient autoregressive training strategy, incTNP matches the predictive performance of standard TNPs while reducing the computational cost of updates from quadratic to linear time complexity. We empirically evaluate our model on a range of synthetic and real-world tasks, including tabular regression and temperature prediction. Our results show that, surprisingly, incTNP delivers performance comparable to -- or better than -- non-causal TNPs while unlocking orders-of-magnitude speedups for sequential inference. Finally, we assess the consistency of the model's updates: by adapting a metric of "implicit Bayesianness", we show that incTNP retains a prediction rule as implicitly Bayesian as standard non-causal TNPs, demonstrating that incTNP achieves the computational benefits of causal masking without sacrificing the consistency required for streaming inference.
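The core mechanism the abstract describes -- ingesting one observation at a time and reusing previously computed keys and values -- can be illustrated with a minimal single-head attention sketch. This is not the paper's implementation; the class and weight names are hypothetical, and random projections stand in for trained parameters. The point is only the cost structure: each `update` does constant work per new observation, and each prediction attends over the cache in linear time, instead of re-encoding the full context quadratically.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

class CachedAttention:
    """Illustrative single-head attention with a KV cache (hypothetical API)."""

    def __init__(self, d, seed=0):
        rng = np.random.default_rng(seed)
        # random projections stand in for trained Q/K/V weight matrices
        self.Wq = rng.standard_normal((d, d)) / np.sqrt(d)
        self.Wk = rng.standard_normal((d, d)) / np.sqrt(d)
        self.Wv = rng.standard_normal((d, d)) / np.sqrt(d)
        self.keys, self.values = [], []

    def update(self, x):
        # Incremental step: project only the NEW observation and append
        # its key/value to the cache -- no recomputation of old entries.
        self.keys.append(x @ self.Wk)
        self.values.append(x @ self.Wv)

    def predict(self, x_target):
        # Attend one target query over all n cached context points: O(n).
        K = np.stack(self.keys)    # (n, d)
        V = np.stack(self.values)  # (n, d)
        q = x_target @ self.Wq     # (d,)
        w = softmax(q @ K.T / np.sqrt(q.shape[-1]))
        return w @ V               # (d,) attention readout
```

With causal masking at training time, each context point only ever attends to earlier points, which is what makes this cache valid: appending a new key/value pair never changes the representations already stored.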

Philip Mortimer, Cristiana Diaconu, Tommy Rochussen, Bruno Mlodozeniec, Richard E. Turner • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Regression | elevators (test) | - | - | 19
Regression | Protein (test) | Test Log-Likelihood | 0.0363 | 18
Regression | Skillcraft (test) | Test Log-Likelihood | 0.0078 | 17
Forecasting | HADISD Forecast (test) | Log-Likelihood | 0.6897 | 11
Interpolation | HADISD Interp (test) | Log-Likelihood | 0.0182 | 11
Regression | 1D GP (test) | Log-Likelihood | 0.002 | 11
Regression | Powerplant (test) | Log-Likelihood | 0.0028 | 10
Regression | Tabular Synthetic (test) | Log-Likelihood | 0.161 | 10
Forecasting | 24-hour window OOD (test) | Avg Test Log-Likelihood | 0.1536 | 6
Autoregressive Prediction | 1D GP | Avg Test Log-Likelihood | 0.769 | 5

(Showing 10 of 16 rows)
