Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

On the Optimal Number of Grids for Differentially Private Non-Interactive $K$-Means Clustering

About

Differentially private $K$-means clustering enables releasing cluster centers derived from a dataset while protecting the privacy of the individuals. Non-interactive clustering techniques based on privatized histograms are attractive because the released data synopsis can be reused for other downstream tasks without additional privacy loss. The choice of the number of grids for discretizing the data points is crucial, as it directly controls the quantization bias and the amount of noise injected to preserve privacy. The widely adopted strategy selects a grid size that is independent of the number of clusters and also relies on empirical tuning. In this work, we revisit this choice and propose a refined grid-size selection rule derived by minimizing an upper bound on the expected deviation in the K-means objective function, leading to a more principled discretization strategy for non-interactive private clustering. Compared to prior work, our grid resolution differs both in its dependence on the number of clusters and in the scaling with dataset size and privacy budget. Extensive numerical results elucidate that the proposed strategy results in accurate clustering compared to the state-of-the-art techniques, even under tight privacy budgets.

Gokularam Muthukrishnan, Anshoo Tandon• 2026

Related benchmarks

TaskDatasetResultRank
k-means clusteringSynthetic d=2, K=2, N=100
WCSS1.537
25
k-means clusteringSynthetic (d=2, K=2, N=200)
WCSS2.696
25
k-means clusteringSynthetic d=2, K=4, N=200
WCSS0.844
25
k-means clusteringSynthetic (d=2, K=4, N=400)
WCSS1.515
25
k-means clusteringSynthetic (d=2, K=8, N=400)
WCSS0.481
25
k-means clusteringSynthetic d=2, K=8, N=800
WCSS0.794
25
k-means clusteringSynthetic (d=3, K=2, N=200)
WCSS4.838
25
k-means clusteringSynthetic (d=3, K=2, N=400)
WCSS8.447
25
k-means clusteringSynthetic (d=3, K=4, N=200)
WCSS2.184
25
k-means clusteringSynthetic d=3, K=4, N=400
WCSS3.345
25
Showing 10 of 18 rows

Other info

Follow for update