CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis
About
Tabular data has been receiving growing attention, and synthetic tables are now being applied across an expanding range of tasks. Owing to recent advances in generative modeling, data produced by tabular synthesis models has become sophisticated and realistic. However, modeling the discrete variables (columns) of tabular data remains difficult. In this work, we propose to process continuous and discrete variables separately, each conditioned on the other, using two diffusion models. The two diffusion models co-evolve during training by reading conditions from each other. To bind the two diffusion models further, we also introduce a contrastive learning method with negative sampling. In experiments with 11 real-world tabular datasets and 8 baseline methods, we demonstrate the efficacy of the proposed method, called CoDi.
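The core idea above — two forward diffusion processes, Gaussian for continuous columns and multinomial for discrete columns, with each denoiser conditioned on the other modality — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the cosine schedule, and the shapes are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def noise_continuous(x_c, t, T=100):
    """Gaussian forward diffusion on continuous columns (DDPM-style).

    Uses a simplified cosine noise schedule (an assumption, not CoDi's exact schedule).
    """
    alpha_bar = np.cos(0.5 * np.pi * t / T) ** 2
    eps = rng.standard_normal(x_c.shape)
    x_c_t = np.sqrt(alpha_bar) * x_c + np.sqrt(1.0 - alpha_bar) * eps
    return x_c_t, eps

def noise_discrete(x_d, t, n_cats, T=100):
    """Multinomial forward diffusion on discrete columns: each entry is
    resampled uniformly over its categories with probability 1 - alpha_bar."""
    alpha_bar = np.cos(0.5 * np.pi * t / T) ** 2
    resample = rng.random(x_d.shape) > alpha_bar
    uniform = rng.integers(0, n_cats, size=x_d.shape)
    return np.where(resample, uniform, x_d)

# One conceptual co-evolving training step on a toy mini-batch:
x_c = rng.standard_normal((4, 3))       # 4 rows, 3 continuous columns
x_d = rng.integers(0, 5, size=(4, 2))   # 4 rows, 2 discrete columns, 5 categories
t = rng.integers(1, 100)                # shared diffusion timestep

x_c_t, eps = noise_continuous(x_c, t)
x_d_t = noise_discrete(x_d, t, n_cats=5)

# In CoDi, a continuous-side model would predict eps from (x_c_t, t) conditioned
# on x_d_t, while a discrete-side model recovers x_d from (x_d_t, t) conditioned
# on x_c_t; training both jointly is what co-evolves the two diffusion models.
```

The key point the sketch illustrates is the cross-conditioning: each model's input at timestep `t` includes the *other* modality's noisy sample, so gradients on one side depend on the other side's diffusion process.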
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Tabular Data Utility | Magic (test) | AUC | 0.931 | 14 |
| Tabular Data Utility | California (test) | AUC | 0.981 | 14 |
| Tabular Data Utility | Adult (test) | AUC | 0.829 | 14 |
| Tabular Data Utility | Default (test) | AUC | 0.497 | 14 |
| Tabular Data Utility | Shoppers (test) | AUC | 0.855 | 13 |
| Tabular Data Synthesis | Aggregate of five tabular datasets (full train vs original train) | Marginal Error | 21.7 | 13 |
| Tabular Data Generation | Adult (test) | MLE | 0.871 | 12 |
| Tabular Data Generation | Magic (test) | MLE | 0.932 | 12 |
| Tabular Data Generation | Shoppers (test) | MLE | 0.865 | 12 |
| Tabular Data Generation | Beijing (test) | MLE | 0.818 | 12 |