Test Time Training for Supervised Causal Learning
About
Supervised Causal Learning (SCL) has shown promise in causal discovery by framing it as a supervised learning problem. However, it suffers from significant out-of-distribution generalization challenges. We reveal three limitations of previous SCL practices: a significant performance gap between synthetic benchmarks and real-world data, fragility to distribution shifts, and failure in compositional generalization, which collectively call its real-world applicability into question. To address these limitations, we propose Test-Time Training for Supervised Causal Learning (TTT-SCL), a novel framework that dynamically generates training sets explicitly aligned with any specific test instance. We demonstrate the correlation between TTT-SCL and score-based methods, and design an efficient module for generating training sets based on the classic scoring function. Experiments on synthetic benchmarks, pseudo-real datasets, and real-world datasets demonstrate that TTT-SCL significantly outperforms existing SCL and traditional causal discovery methods.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Causal Discovery | Syntren | F1 Score | 32.14 | 22 |
| Edge Prediction | Linear_U | AUROC | 86.3 | 9 |
| Edge Prediction | Chebyshev_G | AUROC | 83 | 9 |
| Edge Prediction | Sachs | AUROC | 78.9 | 9 |
| Edge Prediction | Syntren | AUROC | 80.1 | 9 |
| Edge Prediction | RFF_G | AUROC | 91.8 | 9 |
| Causal Discovery | Asia bnlearn | AUROC | 91 | 6 |
| Causal Discovery | Cancer bnlearn | AUROC | 91.6 | 6 |
| Causal Discovery | Earthquake bnlearn repository | AUROC | 98.8 | 6 |
| Causal Discovery | Survey bnlearn | AUROC | 95.5 | 6 |
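
For readers unfamiliar with how the metrics in the table are usually computed for graph recovery, the following is a minimal sketch, not the paper's evaluation code. It assumes the ground truth is a binary adjacency matrix, the model outputs one score per ordered node pair, and self-loops are excluded. The function names, the 0.5 threshold, and the toy graph are all hypothetical. The values it returns lie in [0, 1]; the table appears to report them scaled by 100.

```python
from itertools import product


def edge_labels_and_scores(true_adj, score_adj):
    """Flatten off-diagonal entries into parallel label and score lists."""
    n = len(true_adj)
    labels, scores = [], []
    for i, j in product(range(n), repeat=2):
        if i != j:  # assumption: self-loops are not candidate edges
            labels.append(true_adj[i][j])
            scores.append(score_adj[i][j])
    return labels, scores


def auroc(labels, scores):
    """Probability that a true edge outscores a non-edge (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one edge and one non-edge")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 of the edge set obtained by thresholding the scores."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy example: the true graph is the chain 0 -> 1 -> 2.
true_adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
score_adj = [[0.0, 0.9, 0.2], [0.1, 0.0, 0.4], [0.3, 0.6, 0.0]]

labels, scores = edge_labels_and_scores(true_adj, score_adj)
print(auroc(labels, scores))     # 0.875
print(f1_score(labels, scores))  # 0.5
```

AUROC is threshold-free and ranks all candidate edges, while F1 depends on the chosen cutoff, which is one reason the two metrics can differ sharply on the same dataset.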