
Privacy Preserving Diffusion Models for Mixed-Type Tabular Data Generation

About

We introduce DP-FinDiff, a differentially private diffusion framework for synthesizing mixed-type tabular data. DP-FinDiff employs embedding-based representations for categorical features, reducing encoding overhead and scaling to high-dimensional datasets. To adapt DP-training to the diffusion process, we propose two privacy-aware training strategies: an adaptive timestep sampler that aligns updates with diffusion dynamics, and a feature-aggregated loss that mitigates clipping-induced bias. Together, these enhancements improve fidelity and downstream utility without weakening privacy guarantees. On financial and medical datasets, DP-FinDiff achieves 16-42% higher utility than DP baselines at comparable privacy levels, demonstrating its promise for safe and effective data sharing in sensitive domains.
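The abstract refers to DP training and to clipping-induced bias without showing the mechanism behind them. As a rough illustration only, the sketch below shows a generic DP-SGD gradient step (per-example clipping followed by Gaussian noise). It is not DP-FinDiff's adaptive timestep sampler or feature-aggregated loss; the function names, toy gradients, and parameter values are assumptions made for this example.

```python
import math
import random


def clip_to_norm(grad, max_norm):
    """Scale one per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    dim = len(per_example_grads[0])
    clipped = [clip_to_norm(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise scale is tied to the clipping bound (hypothetical settings here).
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]


# Toy per-example gradients with L2 norms 5.0 and 0.5 (illustrative values).
grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)

# Plain average, with no clipping and no noise.
plain_mean = [sum(g[i] for g in grads) / len(grads) for i in range(2)]

# Clipping only (noise_multiplier=0) isolates the effect of clipping.
clipped_mean = dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)

# Clipping plus noise, as in a private update.
private_grad = dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)
```

With these toy values the large gradient is scaled down while the small one passes through unchanged, so `clipped_mean` is smaller in magnitude than `plain_mean`. That gap is the kind of clipping-induced bias the abstract says the feature-aggregated loss is meant to mitigate.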

Timur Sattarov, Marco Schreyer, Damian Borth • 2025

Related benchmarks

Task                   Dataset         Metric   Result  Rank
Classification         Credit          ROC AUC  69.7    50
Classification         Adult           ROC AUC  0.792   40
Binary Classification  Diabetes        AUC      0.584   34
Binary Classification  bank-marketing  AUC      0.804   19
Object Classification  Payments        ROC AUC  0.799   12