Neural Autoregressive Flows for Markov Boundary Learning
About
Recovering the Markov boundary -- the minimal set of variables that maximizes predictive performance for a response variable -- is crucial in many applications. While recent advances improve on traditional constraint-based techniques by scoring local causal structures, they still rely on nonparametric estimators and heuristic searches and lack theoretical guarantees of reliability. This paper investigates a framework for efficient Markov boundary discovery that uses conditional entropy from information theory as its scoring criterion. We design a novel masked autoregressive network to capture complex dependencies, and we propose a parallelizable greedy search strategy that runs in polynomial time, supported by analytical evidence. We also discuss how initializing a graph with learned Markov boundaries accelerates the convergence of causal discovery. Comprehensive evaluations on real-world and synthetic datasets demonstrate the scalability and superior performance of our method on both Markov boundary discovery and causal discovery tasks.
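To make the scoring idea concrete, here is a minimal illustrative sketch of a forward greedy search that selects the variable giving the largest drop in conditional entropy H(Y | S) at each step. It uses a plug-in entropy estimate on discrete data and a hypothetical gain threshold `eps`; the paper's actual method estimates these quantities with a masked autoregressive network, which is not reproduced here.

```python
import numpy as np
from collections import Counter

def entropy(rows):
    # Plug-in Shannon entropy (in nats) of discrete rows (2-D array).
    n = len(rows)
    counts = Counter(map(tuple, rows))
    p = np.array(list(counts.values())) / n
    return float(-(p * np.log(p)).sum())

def cond_entropy(y, X):
    # H(Y | X) = H(Y, X) - H(X), empirical plug-in estimate.
    if X.shape[1] == 0:
        return entropy(y.reshape(-1, 1))
    return entropy(np.column_stack([y, X])) - entropy(X)

def greedy_markov_boundary(X, y, eps=1e-3):
    # Forward greedy search: repeatedly add the variable that most
    # reduces H(y | selected); stop when the gain falls below eps.
    selected, remaining = [], list(range(X.shape[1]))
    best = cond_entropy(y, X[:, []])
    while remaining:
        scores = [cond_entropy(y, X[:, selected + [j]]) for j in remaining]
        j_best = int(np.argmin(scores))
        if best - scores[j_best] < eps:
            break
        best = scores[j_best]
        selected.append(remaining.pop(j_best))
    return selected
```

Note that this greedy sketch is sequential; the parallelizable strategy in the paper refers to scoring candidate variables concurrently, which maps naturally onto the list comprehension over `remaining` above.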
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Causal Discovery | Sachs real data (d=11) | Structural Hamming Distance (SHD) | 10 | 19 |
| Markov boundary discovery | d30 Linear | nDCG | 99.34 | 7 |
| Markov boundary discovery | d30-G Nonlinear | nDCG | 93.5 | 7 |
| Markov boundary discovery | d30-MN Nonlinear | nDCG | 90.67 | 7 |
| Markov boundary discovery | Sachs Semi-real network | nDCG | 86.44 | 7 |
| Markov boundary discovery | SynTReN Semi-real network | nDCG | 68.36 | 7 |
| Causal Discovery | d30-G Nonlinear | SHD | 12.8 | 6 |
| Causal Discovery | d30-MN Nonlinear | SHD | 18.2 | 6 |
| Causal Discovery | d100 Nonlinear 1 | SHD | 25.6 | 6 |
| Causal Discovery | SynTReN Semi-real | SHD | 19 | 6 |
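The SHD values above count the edge edits needed to turn the estimated graph into the true one. A minimal sketch of one common convention (missing, extra, and reversed edges each count once; some benchmarks instead count a reversal as two errors) on binary adjacency matrices:

```python
import numpy as np

def shd(A_true, A_est):
    # Structural Hamming Distance between two directed graphs given as
    # binary adjacency matrices. A reversed edge differs at (i, j) and
    # (j, i); symmetrizing the mismatch and clipping collapses it to a
    # single error, and summing the upper triangle counts each pair once.
    d = (A_true != A_est).astype(int)
    sym = np.clip(d + d.T, 0, 1)
    return int(np.triu(sym, 1).sum())
```

For example, reversing one true edge and adding one spurious edge yields an SHD of 2 under this convention.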