Deep Kernel Learning
About
We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure-exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation. These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability. We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process. Inference and learning cost $O(n)$ for $n$ training points, and predictions cost $O(1)$ per test point. On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models and over stand-alone deep architectures.
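The core construction above is $k(x, x') = k_{\text{base}}(g(x), g(x'))$: a deep network $g$ warps the inputs before a base kernel is applied, and everything is trained jointly through the GP marginal likelihood. The following is a minimal NumPy sketch of that idea only, under simplifying assumptions: it uses an RBF base kernel and a small fixed network (the paper uses a spectral mixture kernel with a learned deep architecture), and a naive $O(n^3)$ Cholesky factorization rather than the paper's local kernel interpolation and Kronecker/Toeplitz algebra that yield $O(n)$ training cost. All names (`net`, `deep_kernel`, the weight shapes) are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fixed weights standing in for a learned deep architecture (illustrative only).
W1 = rng.normal(size=(1, 8))
W2 = rng.normal(size=(8, 2)) / np.sqrt(8)

def net(X):
    """Deep feature map g(x): warps inputs before the base kernel sees them."""
    return np.tanh(np.tanh(X @ W1) @ W2)

def rbf(A, B, lengthscale=1.0):
    """Base kernel (RBF here for brevity; the paper uses a spectral mixture kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def deep_kernel(A, B):
    # The deep kernel construction: k(x, x') = k_base(g(x), g(x')).
    return rbf(net(A), net(B))

def neg_log_marginal_likelihood(X, y, noise=0.1):
    """GP negative log marginal likelihood -- the joint training objective
    for both kernel and network parameters (naive O(n^3) version)."""
    n = len(X)
    K = deep_kernel(X, X) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2 * np.pi)

# Tiny regression example: 20 points of a sine curve.
X = np.linspace(-3, 3, 20)[:, None]
y = np.sin(X).ravel()
nll = neg_log_marginal_likelihood(X, y)
```

In a full implementation, `nll` would be differentiated with respect to the network weights and kernel hyperparameters and minimized; the paper's scalability comes from replacing the dense Cholesky step with structured inducing-point approximations.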
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| GP regression | Kernel Cookbook 1.0 (test) | MSE | 2.52e-4 | 35 |
| Regression | elevators (test) | RMSE | 0.084 | 19 |
| Zero-shot performance prediction | UDPOS | MAE | 6.02 | 18 |
| Zero-shot performance prediction | XNLI | MAE | 2.16 | 18 |
| Zero-shot performance prediction | WikiAnn | MAE | 11.51 | 18 |
| Few-shot regression | Periodic functions in-range (test) | MSE | 2.08 | 10 |
| Regression | Protein (test) | RMSE | 0.46 | 10 |
| Zero-shot performance prediction | Tatoeba | MAE | 6.38 | 9 |
| Performance prediction | PAWS | MAE | 1.27 | 9 |
| Performance prediction | XQuAD | MAE | 4.13 | 9 |