
PrivSyn: Differentially Private Data Synthesis

About

In differential privacy (DP), a challenging problem is to generate synthetic datasets that efficiently capture the useful information in the private data. A synthetic dataset lets any task be carried out without privacy concerns and without modifying existing algorithms. In this paper, we present PrivSyn, the first automatic synthetic data generation method that can handle general tabular datasets (with 100 attributes and domain size $>2^{500}$). PrivSyn is composed of a new method to automatically and privately identify correlations in the data, and a novel method to generate sample data from a dense graphical model. We extensively evaluate different methods on multiple datasets to demonstrate the performance of our method.
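To give a sense of the scale quoted in the abstract, the short sketch below works through the arithmetic for a hypothetical dataset. The figure of 32 values per attribute is an assumption chosen only because it makes 100 attributes reach a joint domain of exactly $2^{500}$ cells; it is not taken from the paper. The helper names are likewise illustrative. The sketch compares the size of the full joint histogram with the total number of cells across all pairwise (two-way) tables, which is one plausible reason a method would look at correlations between attributes rather than at the full joint distribution.

```python
from math import comb


def full_domain_size(domain_sizes):
    """Number of cells in the full joint histogram (product of all domain sizes)."""
    size = 1
    for d in domain_sizes:
        size *= d
    return size


def pairwise_marginal_cells(domain_sizes):
    """Total number of cells summed over every two-way marginal table."""
    total = 0
    n = len(domain_sizes)
    for i in range(n):
        for j in range(i + 1, n):
            total += domain_sizes[i] * domain_sizes[j]
    return total


# Hypothetical dataset: 100 attributes with 32 values each (an assumption,
# not a figure from the paper).
domain_sizes = [32] * 100

full = full_domain_size(domain_sizes)
pairs = pairwise_marginal_cells(domain_sizes)

print(full == 2 ** 500)        # True
print(comb(100, 2))            # 4950 attribute pairs
print(pairs)                   # 5068800 cells across all pairwise tables
```

Under these assumed sizes, the full joint table is far too large to materialize, while all 4950 pairwise tables together hold about five million cells.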

Zhikun Zhang, Tianhao Wang, Ninghui Li, Jean Honorio, Michael Backes, Shibo He, Jiming Chen, Yang Zhang • 2020

Related benchmarks

Task | Dataset | Result | Rank
Offline Reinforcement Learning | Kitchen Partial | Normalized Score: 0.2 | 62
Offline Reinforcement Learning | Maze2D medium | Normalized Return: 31.6 | 38
Offline Reinforcement Learning | Maze2D umaze | Normalized Return: 5.7 | 38
Offline Reinforcement Learning | Maze2D large | Normalized Return: 3.4 | 33
Offline Reinforcement Learning | MuJoCo HalfCheetah | Normalized Return: 2.4 | 33
ATE Estimation | IHDP | Memory Consumption (MB): 540.4 | 7
ATE Estimation | Lalonde | Memory Consumption (MB): 535 | 7
ATE Estimation | ACIC | Memory Consumption (MB): 572.2 | 7
ATE Estimation | Synth | Memory Consumption (MB): 537.6 | 7
Average Treatment Effect Estimation | IHDP | Running Time (s): 67.02 | 7
Showing 10 of 18 rows
