Learning Sparse Nonparametric DAGs
About
We develop a framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data. Our approach builds on a recent algebraic characterization of DAGs that led to a fully continuous program for score-based learning of DAG models parametrized by a linear structural equation model (SEM). We extend this characterization to nonparametric SEMs by defining sparsity through partial derivatives. The result is a continuous optimization problem that covers a variety of nonparametric and semiparametric models, with GLMs, additive noise models, and index models as special cases. Unlike existing approaches that require specific modeling choices, loss functions, or algorithms, our framework applies to general nonlinear models (e.g., those without additive noise), general differentiable loss functions, and generic black-box optimization routines. The code is available at https://github.com/xunzheng/notears.
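
To make the two ingredients of the abstract concrete, the following is a minimal sketch rather than the released implementation. It assumes the acyclicity function from the linear case, h(W) = tr(exp(W∘W)) − d, which is zero exactly when the weighted adjacency matrix W describes a DAG. It also assumes that each function f_j is a multilayer perceptron, so that the dependence of f_j on variable k can be summarized by the L2 norm of the k-th column of its first-layer weights; that choice stands in for the partial-derivative-based sparsity and is an assumption of this sketch. The names `acyclicity` and `dependency_matrix` are hypothetical, and the matrix exponential is approximated by a truncated power series in pure Python to keep the example dependency-free.

```python
import math


def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(W, terms=30):
    """Approximate h(W) = tr(exp(W∘W)) - d with a truncated power series.

    W∘W is the elementwise square of W. h(W) is 0 when W encodes a DAG
    and positive when the graph contains a directed cycle.
    """
    d = len(W)
    A = [[w * w for w in row] for row in W]            # elementwise square
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace_exp = float(d)                                # tr(A^0) / 0!
    for k in range(1, terms + 1):
        power = matmul(power, A)                        # A^k
        trace_exp += sum(power[i][i] for i in range(d)) / math.factorial(k)
    return trace_exp - d


def dependency_matrix(first_layers):
    """Build W from first-layer weights (assumed MLP parametrization).

    first_layers[j] is the (hidden x d) first-layer weight matrix of the
    network modeling variable j. W[k][j] is the L2 norm of its k-th column,
    which is zero when that network ignores input k.
    """
    d = len(first_layers)
    W = [[0.0] * d for _ in range(d)]
    for j, A1 in enumerate(first_layers):
        for k in range(d):
            W[k][j] = math.sqrt(sum(row[k] ** 2 for row in A1))
    return W


# Chain 0 -> 1 -> 2 is acyclic, so h is exactly 0.
print(acyclicity([[0, 1.5, 0], [0, 0, -2.0], [0, 0, 0]]))
# Two-node cycle 0 <-> 1 gives a positive value (about 1.086).
print(acyclicity([[0, 1.0], [1.0, 0]]))
```

In a continuous program of this kind, a score such as a least-squares loss would be minimized subject to h(W) = 0. For matrices of realistic size, a library routine such as `scipy.linalg.expm` would be a more natural choice than the truncated series used here.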
Related benchmarks
| Task | Dataset | Structural Hamming Distance (SHD) | Rank |
|---|---|---|---|
| DAG Structure Recovery | non-linear-1 5000 samples | 5.2 | 48 |
| Causal Discovery | Synthetic Temporal Sequences | 6.36 | 40 |
| Causal Discovery | non-linear-2 d=10, 5000 samples (test) | 5.4 | 12 |
| Causal Discovery | non-linear-2 d=20, 5000 samples (test) | 13.8 | 12 |
| Causal Discovery | non-linear-2 d=50, 5000 samples (test) | 30.4 | 12 |
| Causal Discovery | non-linear-2 d=100, 5000 samples (test) | 85.6 | 12 |
| Causal Structure Learning | Linear Synthetic Data d=10, 5000 samples | 4.6 | 12 |
| Causal Structure Learning | Linear Synthetic Data d=20, 5000 samples | 7.6 | 12 |
| Causal Structure Learning | Linear Synthetic Data d=50, 5000 samples | 29.6 | 12 |
| Causal Structure Learning | Linear Synthetic Data d=100, 5000 samples | 74 | 12 |
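
The results above are reported as structural Hamming distance (SHD), where lower is better. As a rough illustration of the metric, the sketch below counts the edge additions, deletions, and reversals needed to turn an estimated DAG into the true one, with a reversed edge counted once. Conventions differ between benchmarks (some count a reversal as two errors), so this is one common variant and not necessarily the one used for the numbers in the table. The function name `shd` and the edge-set representation are assumptions of this example.

```python
def shd(true_edges, est_edges):
    """Structural Hamming distance between two DAGs given as (parent, child) pairs.

    Counts extra and missing undirected adjacencies, plus edges that are
    present in both graphs but oriented differently (each counted once).
    Assumes neither graph contains both (i, j) and (j, i).
    """
    true_edges, est_edges = set(true_edges), set(est_edges)
    # Edges whose adjacency is correct but whose direction is flipped.
    reversed_count = sum(1 for (i, j) in est_edges
                         if (j, i) in true_edges and (i, j) not in true_edges)
    true_skeleton = {frozenset(e) for e in true_edges}
    est_skeleton = {frozenset(e) for e in est_edges}
    extra = len(est_skeleton - true_skeleton)      # adjacencies to delete
    missing = len(true_skeleton - est_skeleton)    # adjacencies to add
    return extra + missing + reversed_count


# True graph 0 -> 1 -> 2; estimate reverses one edge, adds one, misses one.
print(shd({(0, 1), (1, 2)}, {(1, 0), (0, 2)}))  # prints 3
```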