CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis

About

As tabular data draws growing attention, synthetic tables are being applied to an increasingly wide range of tasks and scenarios. Owing to recent advances in generative modeling, the fake data produced by tabular data synthesis models has become sophisticated and realistic. However, modeling the discrete variables (columns) of tabular data remains difficult. In this work, we propose to process continuous and discrete variables separately, each conditioned on the other, using two diffusion models. The two diffusion models are co-evolved during training by reading conditions from each other. To bind the diffusion models further, we introduce a contrastive learning method with negative sampling. In experiments with 11 real-world tabular datasets and 8 baseline methods, we demonstrate the efficacy of the proposed method, called CoDi.
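The abstract does not spell out how negatives are sampled or which contrastive objective is used. The following is a minimal sketch under two stated assumptions, neither confirmed by the text above: negatives are formed by pairing each row with the discrete condition of a different row, and the objective is a triplet-style margin loss. The names `derangement_indices` and `triplet_loss` are hypothetical helpers written for illustration, not part of any CoDi code.

```python
import random


def derangement_indices(n, rng):
    """Return idx with idx[i] != i for every i (requires n >= 2).

    Assumption: a mismatched condition for row i is simply the
    condition taken from some other row idx[i].
    """
    if n < 2:
        raise ValueError("need at least two rows to form mismatched pairs")
    perm = list(range(n))
    rng.shuffle(perm)
    idx = [0] * n
    # Map each element to its predecessor in the shuffled cycle, so no
    # position is ever mapped to itself.
    for k in range(n):
        idx[perm[k]] = perm[k - 1]
    return idx


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchors, positives, negatives, margin=1.0):
    """Mean of max(d(a, p) - d(a, n) + margin, 0) over the batch.

    Assumption: a generic triplet margin loss stands in for the
    contrastive objective mentioned in the abstract.
    """
    losses = [
        max(squared_distance(a, p) - squared_distance(a, n) + margin, 0.0)
        for a, p, n in zip(anchors, positives, negatives)
    ]
    return sum(losses) / len(losses)


# Toy usage: four rows of discrete features, re-paired so that every
# row receives the condition of a different row.
rng = random.Random(0)
discrete_rows = [[0, 1], [1, 0], [1, 1], [0, 0]]
idx = derangement_indices(len(discrete_rows), rng)
negative_conditions = [discrete_rows[j] for j in idx]
```

Note that a row taken from a different index can still carry identical values when duplicates exist in the data, so a practical version might need to re-draw such pairs.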

Chaejeong Lee, Jayoung Kim, Noseong Park • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Tabular Data Utility | Magic (test) | AUC 0.931 | 14 |
| Tabular Data Utility | California (test) | AUC 0.981 | 14 |
| Tabular Data Utility | Adult (test) | AUC 0.829 | 14 |
| Tabular Data Utility | Default (test) | AUC 0.497 | 14 |
| Tabular Data Utility | Shoppers (test) | AUC 0.855 | 13 |
| Tabular Data Synthesis | Aggregate of five tabular datasets (full train vs original train) | Marginal Error 21.7 | 13 |
| Tabular Data Generation | Adult (test) | MLE 0.871 | 12 |
| Tabular Data Generation | Magic (test) | MLE 0.932 | 12 |
| Tabular Data Generation | Shoppers (test) | MLE 0.865 | 12 |
| Tabular Data Generation | Beijing (test) | MLE 0.818 | 12 |
Showing 10 of 11 rows
