SNAP: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation With an Unknown Graph

About

Causal discovery can be computationally demanding for large numbers of variables. If we only wish to estimate the causal effects on a small subset of target variables, we might not need to learn the causal graph for all variables, but only a small subgraph that includes the targets and their adjustment sets. In this paper, we focus on identifying causal effects between target variables in a computationally and statistically efficient way. This task combines causal discovery and effect estimation, aligning the discovery objective with the effects to be estimated. We show that definite non-ancestors of the targets are unnecessary for learning causal relations between the targets and for identifying efficient adjustment sets. We sequentially identify and prune these definite non-ancestors with our Sequential Non-Ancestor Pruning (SNAP) framework, which can be used either as a preprocessing step for standard causal discovery methods, or as a standalone sound and complete causal discovery algorithm. Our results on synthetic and real data show that both approaches substantially reduce the number of independence tests and the computation time without compromising the quality of the causal effect estimates.

Mátyás Schubert, Tom Claassen, Sara Magliacane • 2025
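
The abstract's central observation, that non-ancestors of the targets are not needed to relate the targets to one another, can be illustrated on a toy graph. The sketch below is not the SNAP algorithm: it assumes the DAG is already known, whereas SNAP works with an unknown graph and prunes definite non-ancestors sequentially during discovery. The function names (`ancestors_of`, `prune_non_ancestors`) and the toy graph are hypothetical and chosen only for illustration.

```python
from collections import deque


def ancestors_of(targets, parents):
    """Return the targets plus every node with a directed path into a target.

    `parents` maps each node to a list of its direct parents in a known DAG
    (an assumption of this sketch; SNAP itself does not know the graph).
    """
    seen = set(targets)
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def prune_non_ancestors(targets, parents):
    """Keep only the targets and their ancestors, dropping all other nodes."""
    keep = ancestors_of(targets, parents)
    # Every parent of a kept node is itself an ancestor, so parent lists
    # can be copied unchanged.
    return {node: list(parents.get(node, ())) for node in keep}


# Hypothetical toy DAG: A -> X -> Y, A -> Y, Y -> D, X -> E
toy_parents = {"A": [], "X": ["A"], "Y": ["X", "A"], "D": ["Y"], "E": ["X"]}
pruned = prune_non_ancestors({"X", "Y"}, toy_parents)
print(sorted(pruned))  # D and E are non-ancestors of the targets and are dropped
```

In this toy case the confounder A, which is needed to adjust for the effect of X on Y, survives the pruning, while the descendants D and E do not.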

Related benchmarks

Task	Dataset	Metric	Result	
Causal Structure Learning	Synthetic nD=10000, d=2, dmax=10, 100 nodes	CI Test Count	5.11	13
Causal Structure Learning	Synthetic nD=10000, d=2, dmax=10, 200 nodes	Number of CI Tests	20.02	13
Causal Structure Learning	Synthetic nD=10000, d=2, dmax=10, 400 nodes	CI Test Count	79.98	13
Causal Structure Learning	Synthetic nD=10000, d=2, dmax=10, 800 nodes	Number of CI tests	3.20e+5	11
Causal Structure Learning	Synthetic nD=10000, d=2, dmax=10, 600 nodes	Number of CI tests	179.9	11
Causal Discovery	Binary data 10 nodes, nD=1000, d=2, dmax=10	Number of CI tests	92.52	7
Causal Discovery	Binary data 20 nodes, nD=1000, d=2, dmax=10	Number of CI Tests	242.8	7
Local Causal Discovery	Linear Gaussian 100 nodes	CI Test Count (x10^3)	5.01e+3	7
Local Causal Discovery	Linear Gaussian 200 nodes	CI Test Count (x10^3)	19.93	7
Local Causal Discovery	Linear Gaussian 400 nodes	Number of CI tests (x10^3)	79.81	7

Showing 10 of 18 rows
