Test-Time Training for Supervised Causal Learning

About

Supervised Causal Learning (SCL) has shown promise in causal discovery by framing it as a supervised learning problem. However, it suffers from significant out-of-distribution generalization challenges. We reveal three limitations of previous SCL practices: a significant performance gap between synthetic benchmarks and real-world data, fragility to distribution shifts, and failure in compositional generalization. Together, these call its real-world applicability into question. To address these limitations, we propose Test-Time Training for Supervised Causal Learning (TTT-SCL), a novel framework that dynamically generates training sets explicitly aligned with each specific test instance. We demonstrate the connection between TTT-SCL and score-based methods, and design an efficient module for generating training sets based on a classic scoring function. Experiments on synthetic benchmarks, pseudo-real datasets, and real-world datasets demonstrate that TTT-SCL significantly outperforms existing SCL and traditional causal discovery methods.

Zizhen Deng, Jiaru Zhang, Rui Ding, Huang Bojun, Jinzhuo Wang, Qiang Fu, Shi Han, Dongmei Zhang • 2026

Related benchmarks

Task	Dataset	Metric	Value	
Causal Discovery	Syntren	F1 Score	32.14	22
Causal Discovery	Sachs	F1 Score	56.4	14
Edge Prediction	Linear_U	AUROC	86.3	9
Edge Prediction	Chebyshev_G	AUROC	83	9
Edge Prediction	Sachs	AUROC	78.9	9
Edge Prediction	Syntren	AUROC	80.1	9
Edge Prediction	RFF_G	AUROC	91.8	9
Causal Discovery	Asia bnlearn	AUROC	91	6
Causal Discovery	Cancer bnlearn	AUROC	91.6	6
Causal Discovery	Earthquake bnlearn repository	AUROC	98.8	6

Showing 10 of 14 rows
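
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how edge-level F1 and AUROC are commonly computed for a predicted causal graph. It is not taken from the paper: the function names `edge_f1` and `edge_auroc`, the three-node example graph, and the 0.5 threshold are all illustrative assumptions, and the paper's exact evaluation protocol (for example, how edge direction is handled or how scores are scaled) may differ.

```python
from itertools import product

def edge_f1(true_adj, pred_adj):
    """F1 over directed edges of two binary adjacency matrices (diagonal ignored)."""
    n = len(true_adj)
    tp = fp = fn = 0
    for i, j in product(range(n), repeat=2):
        if i == j:
            continue
        t, p = bool(true_adj[i][j]), bool(pred_adj[i][j])
        tp += t and p
        fp += (not t) and p
        fn += t and (not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def edge_auroc(true_adj, scores):
    """Probability that a true edge gets a higher score than a non-edge (ties count 0.5)."""
    n = len(true_adj)
    pos, neg = [], []
    for i, j in product(range(n), repeat=2):
        if i != j:
            (pos if true_adj[i][j] else neg).append(scores[i][j])
    if not pos or not neg:
        raise ValueError("need at least one edge and one non-edge")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical ground truth: the chain X0 -> X1 -> X2.
true_adj = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
# Hypothetical edge scores produced by some causal discovery method.
scores = [[0.0, 0.9, 0.4],
          [0.2, 0.0, 0.3],
          [0.1, 0.6, 0.0]]
# Assumed threshold of 0.5 to turn scores into a binary graph for F1.
pred_adj = [[int(s > 0.5) for s in row] for row in scores]

auroc = edge_auroc(true_adj, scores)   # 0.75
f1 = edge_f1(true_adj, pred_adj)       # 0.5
```

Both functions return values in the range 0 to 1. AUROC is threshold-free and evaluates the ranking of candidate edges, whereas F1 depends on the threshold used to binarize the scores.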
