Scalable Exact Inference in Multi-Output Gaussian Processes
About
Multi-output Gaussian processes (MOGPs) leverage the flexibility and interpretability of GPs while capturing structure across outputs, which is desirable, for example, in spatio-temporal modelling. The key problem with MOGPs is their computational scaling of $O(n^3 p^3)$, which is cubic in both the number of inputs $n$ (e.g., time points or locations) and the number of outputs $p$. For this reason, a popular class of MOGPs assumes that the data live around a low-dimensional linear subspace, reducing the complexity to $O(n^3 m^3)$. However, this cost is still cubic in the dimensionality of the subspace $m$, which remains prohibitively expensive for many applications. We propose the use of a sufficient statistic of the data to accelerate inference and learning in MOGPs with orthogonal bases. The method achieves linear scaling in $m$ in practice, allowing these models to scale to large $m$ without sacrificing significant expressivity or requiring approximation. This advance opens up a wide range of real-world tasks and can be combined with existing GP approximations in a plug-and-play way. We demonstrate the efficacy of the method on various synthetic and real-world data sets.
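To make the idea of a sufficient statistic under an orthogonal basis concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a model the abstract does not spell out: observations $y(t) = H x(t) + \varepsilon$ with $H = U S^{1/2}$, where $U$ has orthonormal columns, $S$ is a positive diagonal, the $m$ latent processes $x_i$ are independent GPs, and the noise is isotropic with variance `noise`. Under those assumptions the projection $T = S^{-1/2} U^\top$ decouples the problem into $m$ single-output GPs. The RBF kernels and all function names (`naive_loglik`, `projected_loglik`, and so on) are hypothetical choices for illustration.

```python
import numpy as np

def rbf(t, lengthscale):
    # Squared-exponential kernel matrix on 1-D inputs (illustrative choice).
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gauss_logpdf(y, cov):
    # log N(y; 0, cov) computed through a Cholesky factorisation.
    L = np.linalg.cholesky(cov)
    alpha = np.linalg.solve(L, y)
    return (-0.5 * alpha @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(y) * np.log(2 * np.pi))

def naive_loglik(Y, Ks, U, S, noise):
    # Builds the full (n*p) x (n*p) covariance: cubic in both n and p.
    n, p = Y.shape
    H = U * np.sqrt(S)                      # p x m mixing matrix U S^{1/2}
    cov = noise * np.eye(n * p)
    for i, K in enumerate(Ks):
        cov += np.kron(np.outer(H[:, i], H[:, i]), K)
    return gauss_logpdf(Y.T.reshape(-1), cov)  # outputs stacked one after another

def projected_loglik(Y, Ks, U, S, noise):
    # Projects the data onto the basis, then treats each latent process
    # as an independent single-output GP with noise variance noise / S[i].
    n, p = Y.shape
    m = U.shape[1]
    T = U.T / np.sqrt(S)[:, None]           # m x p projection S^{-1/2} U^T
    Z = Y @ T.T                             # n x m projected observations
    resid = Y - (Y @ U) @ U.T               # part of Y outside span(U)
    ll = -0.5 * n * np.log(S).sum()
    ll -= 0.5 * n * (p - m) * np.log(2 * np.pi * noise)
    ll -= 0.5 * np.sum(resid ** 2) / noise
    for i, K in enumerate(Ks):
        ll += gauss_logpdf(Z[:, i], K + (noise / S[i]) * np.eye(n))
    return ll

rng = np.random.default_rng(0)
n, p, m, noise = 15, 6, 3, 0.1
t = np.linspace(0.0, 1.0, n)
U, _ = np.linalg.qr(rng.standard_normal((p, m)))   # orthonormal columns
S = np.array([2.0, 1.0, 0.5])
Ks = [rbf(t, ell) for ell in (0.1, 0.3, 0.6)]
Y = rng.standard_normal((n, p))

ll_naive = naive_loglik(Y, Ks, U, S, noise)
ll_proj = projected_loglik(Y, Ks, U, S, noise)
print(np.isclose(ll_naive, ll_proj))
```

Under the stated assumptions the two log-likelihoods agree, so the projected version involves no approximation. It factorises $m$ matrices of size $n \times n$ instead of one of size $np \times np$, which is where the linear dependence on $m$ comes from in this sketch.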
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multi-Output Gaussian Process Regression | ERA5 block-wise splitting (test) | MSE | 0.142 | 7 |
| EEG Prediction | EEG held-out (test) | MSE | 0.372 | 7 |
| Spatiotemporal Prediction | ERA5 (random splitting) | MSE | 0.123 | 7 |
| Multi-task Regression | Ship maintenance (test) | R² | 0.994 | 5 |
| Multi-task Regression | Sarcos | R² | 0.984 | 5 |
| Time Series Forecasting | Bramblemet | R² | 0.068 | 5 |
| Inverse Dynamics Prediction | SARCOS (test) | MSE | 0.14 | 5 |
| Multi-task Regression | Neutronics | R² | 0.999 | 3 |