CausalPFN: Amortized Causal Effect Estimation via In-Context Learning

About

Causal effect estimation from observational data is fundamental across various applications. However, selecting an appropriate estimator from dozens of specialized methods demands substantial manual effort and domain expertise. We present CausalPFN, a single transformer that amortizes this workflow: trained once on a large library of simulated data-generating processes that satisfy ignorability, it infers causal effects for new observational datasets out of the box. CausalPFN combines ideas from Bayesian causal inference with the large-scale training protocol of prior-fitted networks (PFNs), learning to map raw observations directly to causal effects without any task-specific adjustment. Our approach achieves superior average performance on heterogeneous and average treatment effect estimation benchmarks (IHDP, Lalonde, ACIC). Moreover, it shows competitive performance for real-world policy making on uplift modeling tasks. CausalPFN provides calibrated uncertainty estimates to support reliable decision-making based on Bayesian principles. This ready-to-use model requires no further training or tuning and takes a step toward automated causal inference (https://github.com/vdblm/CausalPFN/).
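For readers new to the terminology, the estimands named in the abstract can be written in standard potential-outcomes notation. This is a sketch under assumed notation: the symbols X (covariates), T (binary treatment), Y (observed outcome), and Y(0), Y(1) (potential outcomes) are conventional choices, not taken from this page. The last line holds only when ignorability is combined with overlap (0 < P(T = 1 | X = x) < 1) and consistency (Y = Y(T)).

```latex
\begin{align*}
\tau(x) &= \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right] && \text{(CATE)} \\
\mathrm{ATE} &= \mathbb{E}\left[\, \tau(X) \,\right] && \text{(ATE)} \\
\bigl(Y(0),\, Y(1)\bigr) &\perp\!\!\!\perp T \mid X && \text{(ignorability)} \\
\tau(x) &= \mathbb{E}\left[\, Y \mid T = 1,\, X = x \,\right] - \mathbb{E}\left[\, Y \mid T = 0,\, X = x \,\right] && \text{(identification)}
\end{align*}
```

Under these conditions the causal quantity on the left of the last line equals a difference of two conditional expectations of observed data, which is what makes estimation from observational datasets possible.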

Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas, Benson Li, Junwei Ma, Jesse C. Cresswell, Rahul G. Krishnan • 2025

Related benchmarks

Task	Dataset	Metric	Result
Average Treatment Effect (ATE) Estimation	IHDP, ACIC, Lalonde CPS PSID 2016	ATE Error (IHDP)	0.2	13
Causal Inference	IHDP	d_TV (Total Variation Distance)	0.239	12
Conditional Average Treatment Effect (CATE) Estimation	IHDP, ACIC 2016, Lalonde CPS, Lalonde PSID	IHDP Error Metric	0.58	12
Uplift Modeling	Lenta 50k stratified (subsample)	Normalized Qini Score	1	5
Uplift Modeling	Hillstrom 64k rows (full)	Normalized Qini Score	99.2	5
Uplift Modeling	Criteo stratified 50k (subsample)	Normalized Qini Score	85.9	5
Uplift Modeling	Hillstrom Hill(2) 64k rows (full)	Normalized Qini Score	0.968	5
Uplift Modeling	Megafon Mega 50k stratified (subsample)	Normalized Qini	0.97	5
Uplift Modeling	Retail Hero X5 50k stratified (subsample)	Normalized Qini Score	0.922	5
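
The table does not define its error metrics. As a rough illustration only, the sketch below assumes that an "ATE error" is a relative absolute error and that a CATE error is a PEHE-style root mean squared error. Both definitions are common in this literature but are assumptions here, and the function names and toy numbers are hypothetical.

```python
import math
from typing import Sequence


def ate_relative_error(ate_hat: float, ate_true: float) -> float:
    """Relative absolute error of an ATE estimate (assumed definition)."""
    return abs(ate_hat - ate_true) / abs(ate_true)


def pehe(cate_hat: Sequence[float], cate_true: Sequence[float]) -> float:
    """Root mean squared error between estimated and true CATE values."""
    if len(cate_hat) != len(cate_true):
        raise ValueError("cate_hat and cate_true must have the same length")
    squared = [(h - t) ** 2 for h, t in zip(cate_hat, cate_true)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values, made up for illustration (not from any benchmark).
cate_true = [1.0, 2.0, 3.0]
cate_hat = [1.5, 2.0, 2.5]

# The ATE is the mean of the unit-level effects.
ate_true = sum(cate_true) / len(cate_true)
ate_hat = sum(cate_hat) / len(cate_hat)

print("PEHE:", pehe(cate_hat, cate_true))
print("ATE relative error:", ate_relative_error(ate_hat, ate_true))
```

In this toy case the per-unit errors cancel in the average, so the ATE error is zero while the PEHE is not. This is one reason ATE and CATE results are reported separately.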
