
Deep Kernel Learning

About

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure-exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation. These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability. We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process. Inference and learning cost $O(n)$ for $n$ training points, and predictions cost $O(1)$ per test point. On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.
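Concretely, the deep kernel replaces a base kernel $k(x_i, x_j \mid \theta)$ with $k(g(x_i, w), g(x_j, w) \mid \theta, w)$, where $g(\cdot, w)$ is a deep network and $k$ is the spectral mixture base kernel; the network weights $w$ and kernel hyperparameters $\theta$ are learned jointly through the GP marginal likelihood. The sketch below is a minimal, hypothetical implementation using the GPyTorch library, not the authors' original code: the network shape, grid size, number of mixture components, training schedule, and the tanh squashing used to keep features inside the interpolation grid are all illustrative choices rather than the paper's settings.

```python
import torch
import gpytorch

# Illustrative deep kernel learning sketch (GPyTorch, not the authors' code):
# a small MLP g(x, w) feeds a spectral mixture base kernel, wrapped in
# KISS-GP grid interpolation for the scalable inference the abstract describes.

class DKLRegression(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, feature_dim=2):
        super().__init__(train_x, train_y, likelihood)
        # g(x, w): hypothetical two-layer feature extractor.
        self.feature_extractor = torch.nn.Sequential(
            torch.nn.Linear(train_x.size(-1), 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, feature_dim),
        )
        self.mean_module = gpytorch.means.ConstantMean()
        # Spectral mixture base kernel under local kernel interpolation.
        self.covar_module = gpytorch.kernels.GridInterpolationKernel(
            gpytorch.kernels.SpectralMixtureKernel(
                num_mixtures=4, ard_num_dims=feature_dim),
            grid_size=64, num_dims=feature_dim,
        )

    def forward(self, x):
        z = torch.tanh(self.feature_extractor(x))  # keep features on the grid
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z))

# Joint learning of network weights and kernel hyperparameters through the
# exact GP marginal likelihood, on toy data standing in for a real dataset.
train_x = torch.rand(1000, 8)
train_y = torch.sin(train_x.sum(-1))
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = DKLRegression(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

model.train()
likelihood.train()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(100):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)  # negative marginal log likelihood
    loss.backward()
    optimizer.step()
```

Because the whole model is a single differentiable objective, one optimizer step updates the network and the kernel hyperparameters together, which is the "drop-in replacement" property the abstract emphasizes.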

Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing • 2015

Related benchmarks

Task                              Dataset                             Metric  Result   Rank
GP regression                     Kernel Cookbook 1.0 (test)          MSE     2.52e-4  35
Regression                        elevators (test)                    RMSE    0.084    19
Zero-shot performance prediction  UDPOS                               MAE     6.02     18
Zero-shot performance prediction  XNLI                                MAE     2.16     18
Zero-shot performance prediction  WikiAnn                             MAE     11.51    18
Few-shot regression               Periodic functions in-range (test)  MSE     2.08     10
Regression                        Protein (test)                      RMSE    0.46     10
Zero-shot performance prediction  Tatoeba                             MAE     6.38     9
Performance Prediction            PAWS                                MAE     1.27     9
Performance Prediction            XQuAD                               MAE     4.13     9
Showing 10 of 28 rows
