Differentially Private Histograms in the Shuffle Model from Fake Users

About

There has been much recent work in the shuffle model of differential privacy, particularly for approximate $d$-bin histograms. While these protocols achieve low error, the number of messages sent by each user -- the message complexity -- has so far scaled with $d$ or the privacy parameters. The message complexity is an informative predictor of a shuffle protocol's resource consumption. We present a protocol whose message complexity is two when there are sufficiently many users. The protocol essentially pairs each row in the dataset with a fake row and performs a simple randomization on all rows. We show that the error introduced by the protocol is small, using rigorous analysis as well as experiments on real-world data. We also prove that corrupt users have a relatively low impact on our protocol's estimates.
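The pairing idea can be illustrated with a minimal sketch. The abstract does not specify the randomizer, so the details here are assumptions chosen for illustration: each real row is a one-hot vector over $d$ bins, each fake row is all zeros, and every bit of every row is flipped independently with a hypothetical probability `q` below 0.5. The function names are also hypothetical, and the sketch does not calibrate `q` to any privacy parameters.

```python
import random

def randomize_row(row, q, rng):
    """Flip each bit of a binary row independently with probability q."""
    return [bit ^ int(rng.random() < q) for bit in row]

def user_messages(value, d, q, rng):
    """Return the two messages one user sends: a randomized real row
    and a randomized fake row (assumed here to be all zeros)."""
    real = [1 if j == value else 0 for j in range(d)]
    fake = [0] * d
    return [randomize_row(real, q, rng), randomize_row(fake, q, rng)]

def estimate_histogram(messages, d, q):
    """Debias per-bin sums over all shuffled messages.

    Every message contributes q to each bin in expectation, and a real
    row adds (1 - 2q) to its own bin, so the count for bin j is
    estimated as (sum_j - m * q) / (1 - 2q), where m is the number of
    messages.
    """
    m = len(messages)
    sums = [sum(msg[j] for msg in messages) for j in range(d)]
    return [(s - m * q) / (1 - 2 * q) for s in sums]

def run_protocol(data, d, q, seed=0):
    """Simulate users, the shuffler, and the analyzer end to end."""
    rng = random.Random(seed)
    messages = []
    for value in data:
        messages.extend(user_messages(value, d, q, rng))
    rng.shuffle(messages)  # the shuffler hides which user sent which message
    return estimate_histogram(messages, d, q)

if __name__ == "__main__":
    data = [0] * 500 + [1] * 300 + [2] * 200
    print(run_protocol(data, d=3, q=0.1, seed=42))
```

Under these assumptions the estimate is unbiased, and its variance grows with `q` and with the total number of messages, which is one way to see why adding only one fake row per user keeps the error small.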

Albert Cheu, Maxim Zhilyaev • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Frequency Estimation | AOL | Relative Error | 7.1 | 6 |
| Frequency Estimation | SF_Sal | Relative Error | 5.81 | 6 |
| Frequency Estimation | BR_Sal | Relative Error (%) | 5.8 | 6 |
| Frequency Estimation (Qhist) | Synthetic Zipf distribution | Relative Error (w/o attacker) | 0.12 | 3 |
| Frequency Estimation | Shuffle-DP Theoretical Analysis | Messages per User | 2 | 3 |
| Frequency Estimation (Qhist) | Synthetic Unif distribution | Relative Error (w/o Attacker) | 5.84 | 3 |
| Frequency Estimation (Qhist) | Synthetic Gauss distribution | Relative Error (No Attacker) | 5.5 | 3 |