Easy Differentially Private Linear Regression

About

Linear regression is a fundamental tool for statistical analysis. This has motivated the development of linear regression methods that also satisfy differential privacy and thus guarantee that the learned model reveals little about any one data point used to construct it. However, existing differentially private solutions assume that the end user can easily specify good data bounds and hyperparameters. Both present significant practical obstacles. In this paper, we study an algorithm which uses the exponential mechanism to select a model with high Tukey depth from a collection of non-private regression models. Given $n$ samples of $d$-dimensional data used to train $m$ models, we construct an efficient analogue using an approximate Tukey depth that runs in time $O(d^2n + dm\log(m))$. We find that this algorithm obtains strong empirical performance in the data-rich setting with no data bounds or hyperparameter selection required.
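The selection step described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the abstract: the $m$ models are fit by ordinary least squares on disjoint random subsets, the approximate Tukey depth is taken coordinate-wise (the smaller one-sided count along each axis, minimized over axes), and the exponential mechanism picks among the $m$ fitted models using depth as the score with sensitivity 1. All function names are hypothetical. The sketch leaves out whatever further steps the paper uses to establish its privacy guarantee, so it should not be treated as differentially private as written, and its quadratic-in-$m$ depth loop does not achieve the stated $O(d^2n + dm\log(m))$ running time.

```python
import numpy as np

def fit_models(X, y, m, rng):
    """Fit m non-private OLS models on disjoint random subsets (assumed setup)."""
    parts = np.array_split(rng.permutation(len(X)), m)
    return np.stack([np.linalg.lstsq(X[p], y[p], rcond=None)[0] for p in parts])

def approx_tukey_depth(models, point):
    """Coordinate-wise depth of `point` among the rows of `models`.

    For each axis, count models at or below and at or above the point,
    take the smaller count, then take the minimum over axes.
    """
    below = (models <= point).sum(axis=0)
    above = (models >= point).sum(axis=0)
    return int(np.minimum(below, above).min())

def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Sample an index with probability proportional to exp(eps * score / (2 * sens))."""
    scores = np.asarray(scores, dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(scores), p=probs))

def select_model(X, y, m=50, epsilon=1.0, seed=0):
    """Illustrative only: choose a deep model among the m fitted candidates."""
    rng = np.random.default_rng(seed)
    models = fit_models(X, y, m, rng)
    depths = [approx_tukey_depth(models, w) for w in models]
    return models[exponential_mechanism(depths, epsilon, 1.0, rng)]
```

Favoring deep models means the selected coefficients sit near the coordinate-wise center of the candidate set, so the output depends on where the candidates cluster and not on user-supplied data bounds.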

Kareem Amin, Matthew Joseph, Mónica Ribero, Sergei Vassilvitskii • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Regression | D3 | Average Relative MSE: 1.169 | 11 |
| Regression | D5 | Average Relative MSE: 1.058 | 11 |
| Regression | D1 | Average Relative MSE: 1.4 | 10 |
| Regression | D2 | Average Relative MSE: 1.326 | 10 |
| Regression | D6 | Average Relative MSE: 0.908 | 7 |
| Regression | D8 | Average Relative MSE: 1.079 | 7 |
| Regression | D4 | Average Relative MSE: 0.898 | 7 |
| Regression | D7 | Average Relative MSE: 0.614 | 7 |
| Synthetic Data Generation | Auction Verification | Average Runtime (seconds): 82.9825 | 6 |
| Synthetic Data Generation | Abalone Age | Average Runtime (seconds): 130.1 | 6 |
Showing 10 of 16 rows