
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data

About

Synthetic data generation for tabular datasets must balance fidelity, efficiency, and versatility to meet the demands of real-world applications. We introduce the Tabular Auto-Regressive Generative Network (TabularARGN), a flexible framework designed to handle mixed-type, multivariate, and sequential datasets. By training on all possible conditional probabilities, TabularARGN supports advanced features such as fairness-aware generation, imputation, and conditional generation on any subset of columns. The framework achieves state-of-the-art synthetic data quality while significantly reducing training and inference times, making it ideal for large-scale datasets with diverse structures. Evaluated across established benchmarks, including realistic datasets with complex relationships, TabularARGN demonstrates its capability to synthesize high-quality data efficiently. By unifying flexibility and performance, this framework paves the way for practical synthetic data generation across industries.
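The auto-regressive, condition-on-any-columns idea described above can be illustrated with a toy sketch. This is not TabularARGN's implementation: the function name `sample_row`, the toy table `ROWS`, and the count-based conditional (used here in place of a learned neural model) are all assumptions made for illustration. Each column is sampled in turn, conditioned on the values already fixed or generated.

```python
import random
from collections import Counter

# Hypothetical toy training table (not from the paper).
ROWS = [
    {"age": "young", "income": "low"},
    {"age": "young", "income": "low"},
    {"age": "old", "income": "high"},
]


def sample_row(rows, order, fixed=None, rng=None):
    """Sample one synthetic row, column by column, in the given order.

    Each column is drawn from its empirical distribution among the training
    rows that agree with every value chosen so far. This count-based lookup
    is a stand-in for a learned conditional model.
    """
    rng = rng or random.Random()
    row = dict(fixed or {})  # columns supplied as conditions
    for col in order:
        if col in row:
            continue  # already fixed by the caller
        matches = [r for r in rows if all(r[k] == v for k, v in row.items())]
        if not matches:
            matches = rows  # unseen combination: fall back to the marginal
        counts = Counter(r[col] for r in matches)
        values, weights = zip(*counts.items())
        row[col] = rng.choices(values, weights=weights, k=1)[0]
    return row


# Conditional generation: fix "age" and sample the remaining column.
conditioned = sample_row(ROWS, order=["income", "age"], fixed={"age": "old"})
# Unconditional generation: sample every column in the chosen order.
unconditioned = sample_row(ROWS, order=["income", "age"], rng=random.Random(0))
```

Because any subset of columns can be passed in `fixed` and any column order can be used, the same routine covers unconditional sampling, conditional generation, and filling in missing values in this toy setting.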

Paul Tiwald, Ivona Krchova, Andrey Sidorenko, Mariana Vargas Vieyra, Mario Scriminaci, Michael Platzer • 2025

Related benchmarks

Task                      Dataset            Result                   Rank
Tabular Data Synthesis    Adult              Shape Similarity: 0.985  17
Tabular Data Synthesis    Diabetes           Shapes: 0.989            15
Privacy Evaluation        Adult              --                       10
Privacy Evaluation        Diabetes           --                       9
Synthetic Data Detection  Adult              Overall Score: 0.733     7
Synthetic Data Utility    Adult              Overall Score: 97.1      7
Synthetic Data Detection  Diabetes           Overall Score: 78.9      6
Synthetic Data Utility    Diabetes           Overall Score: 98        6
Privacy Evaluation        Electric Vehicles  Overall Score: 0.998     4
Synthetic Data Utility    Electric Vehicles  Overall Score: 97.8      4
Showing 10 of 20 rows
