Joint Model and Data Sparsification via the Marginal Likelihood

About

Sparse recovery in linear systems underpins applications from signal processing to high-dimensional regression. Sparse Bayesian Learning, grounded in the principle of automatic relevance determination (ARD), offers a practical Bayesian mechanism for feature sparsity via marginal likelihood optimization. Yet, its reliance on a homoscedastic noise model renders it sensitive to data contaminations such as outliers or misspecified noise, harming model fit and predictions. Instead, we propose jointly learning individual feature and sample relevancies, enabling simultaneous model and data sparsification via a single Bayesian objective. This symmetric pruning of model and data offers a natural extension that preserves conjugacy, admits closed-form updates for standard optimization procedures, and aligns with perspectives from robust regression and influence functions. Empirical results across diverse regression tasks affirm that a joint ARD approach consistently yields both sparse and robust prediction models.

Alexander Timans, Thomas Möllenhoff, Christian A. Naesseth, Mohammad Emtiyaz Khan, Eric Nalisnick • 2026
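
The marginal likelihood objective mentioned in the abstract can be illustrated with the standard evidence of a Gaussian linear model. The sketch below is not the authors' implementation. It assumes, as one reading of the abstract, that feature relevancies enter as per-feature prior precisions `alpha` (the usual ARD setup) and that sample relevancies enter as per-sample noise precisions `beta`, with the homoscedastic case recovered when all entries of `beta` are equal. The function name, the toy data, and the chosen values are hypothetical.

```python
import numpy as np


def log_marginal_likelihood(Phi, y, alpha, beta):
    """Log evidence of y for a linear model with weights integrated out.

    Assumed model (illustrative, not taken from the paper):
      w_j   ~ N(0, 1 / alpha_j)   per-feature prior precision (ARD)
      eps_n ~ N(0, 1 / beta_n)    per-sample noise precision
      y = Phi @ w + eps
    """
    n = Phi.shape[0]
    # Marginal covariance of y: diag(1/beta) + Phi diag(1/alpha) Phi^T
    C = np.diag(1.0 / beta) + (Phi / alpha) @ Phi.T
    L = np.linalg.cholesky(C)
    z = np.linalg.solve(L, y)  # z @ z equals y^T C^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + z @ z)


# Toy data with one gross outlier in the first sample.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 3))
w_true = np.array([2.0, 0.0, -1.0])
y = Phi @ w_true + rng.normal(scale=1.0, size=50)
y[0] += 50.0

alpha = np.ones(3)
beta_flat = np.ones(50)      # homoscedastic: every sample equally trusted
beta_down = beta_flat.copy()
beta_down[0] = 1e-4          # treat the outlier as nearly irrelevant

lml_flat = log_marginal_likelihood(Phi, y, alpha, beta_flat)
lml_down = log_marginal_likelihood(Phi, y, alpha, beta_down)
print(lml_flat, lml_down)
```

In this toy setup the evidence is higher once the outlier's noise precision is reduced, which is the intuition behind letting the objective prune samples as well as features. How the paper actually parameterizes and optimizes the sample relevancies is not stated on this page.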

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Kernel regression | Boston 20% (test) | RMSE 3.303 | 28 |
| Kernel regression | Boston 20% n=506 (test) | NLL 2.593 | 20 |
| Regression | Boston (test) | NLL 2.931 | 12 |
| Regression | Power 10% outlier contamination (test) | RMSE 4.19 | 11 |
| Regression | Kin8nm 10% outlier contamination (test) | RMSE 0.136 | 11 |
| Regression | Elevators 10% outlier contamination (test) | RMSE 0.003 | 11 |
| Regression | Yacht 10% outlier contamination | RMSE 3.75 | 11 |
| Regression | Concrete 10% outlier contamination | RMSE 7.55 | 11 |
| Regression | Kin8nm (no contamination) | RMSE 0.132 | 11 |
| Regression | Elevators (no contamination) | RMSE 0.003 | 11 |
Showing 10 of 22 rows
