IV-ICL: Bounding Causal Effects with Instrumental Variables via In-Context Learning

About

The instrumental-variables (IV) setting is standard for partial identification of causal effects when unobserved confounding makes point identification impossible. Existing approaches face methodological bottlenecks: closed-form bound estimands are required (e.g., the Balke-Pearl equations in binary IV), and even when available, designing accurate estimators requires manual effort tailored to each estimand. While direct Bayesian inference of the causal effects, instead of the bounds, circumvents these challenges, it is often computationally intensive and suffers from high prior sensitivity or under-dispersed posteriors. As a remedy, we introduce IV-ICL, an amortized Bayesian in-context learning method that learns the marginal posterior distribution of the causal effects directly and derives bounds as its quantiles. Unlike standard variational inference, which optimizes the exclusive KL divergence, amortized Bayesian inference minimizes the expected inclusive KL, a mass-covering objective. We empirically observe that optimizing inclusive KL can recover the entire identified set across diverse data-generating processes, while optimizing exclusive KL (e.g., with variational inference) under the same Bayesian formulation collapses onto a single mode and fails to cover the identified set. We evaluate IV-ICL on synthetic and semi-synthetic IV benchmarks and show that it produces intervals that are more reliably valid and more informative than those of efficient semi-parametric, Bayesian, and plug-in baselines, at 20-500x lower inference time. Beyond methodology, we propose a procedure to convert randomized controlled trials into IV benchmarks with provably preserved ground-truth causal effects, enabling a more realistic evaluation of partial-identification methods.
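
The contrast between a mass-covering inclusive KL and a mode-seeking exclusive KL can be illustrated with a toy sketch. This is not the IV-ICL method or any of the paper's experiments. It assumes a hypothetical bimodal 1-D target standing in for a posterior spread over an identified set, fits a single Gaussian by grid search under each divergence, and reads a central 95% interval off each fit as quantiles. The target, the grid ranges, and the helper names (`normal_pdf`, `kl`, `fit`, `central_interval`) are all assumptions made for illustration.

```python
import numpy as np

# Integration grid for all numerical KL computations.
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical bimodal target p: equal mixture of N(-2, 0.5^2) and N(2, 0.5^2).
p = 0.5 * normal_pdf(x, -2.0, 0.5) + 0.5 * normal_pdf(x, 2.0, 0.5)

def kl(a, b):
    """Riemann-sum approximation of KL(a || b); eps guards against log(0)."""
    eps = 1e-300
    return float(np.sum(a * (np.log(a + eps) - np.log(b + eps))) * dx)

def fit(direction):
    """Grid-search a single Gaussian q under the chosen KL direction.

    "inclusive" minimizes KL(p || q); anything else minimizes KL(q || p).
    """
    best_value, best_mu, best_sigma = np.inf, None, None
    for mu in np.linspace(-3.0, 3.0, 61):
        for sigma in np.linspace(0.2, 3.0, 57):
            q = normal_pdf(x, mu, sigma)
            value = kl(p, q) if direction == "inclusive" else kl(q, p)
            if value < best_value:
                best_value, best_mu, best_sigma = value, mu, sigma
    return best_mu, best_sigma

def central_interval(density, level=0.95):
    """Central interval of a gridded density, read off its numerical CDF."""
    cdf = np.cumsum(density) * dx
    cdf /= cdf[-1]
    tail = (1.0 - level) / 2.0
    return float(np.interp(tail, cdf, x)), float(np.interp(1.0 - tail, cdf, x))

mu_inc, sigma_inc = fit("inclusive")
mu_exc, sigma_exc = fit("exclusive")
interval_inc = central_interval(normal_pdf(x, mu_inc, sigma_inc))
interval_exc = central_interval(normal_pdf(x, mu_exc, sigma_exc))

print("inclusive KL fit:", mu_inc, sigma_inc, interval_inc)
print("exclusive KL fit:", mu_exc, sigma_exc, interval_exc)
```

Under these assumptions, the inclusive-KL fit should be wide enough for its interval to span both modes, while the exclusive-KL fit should settle on a single mode and leave the other one outside its interval.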

Vahid Balazadeh, Hamidreza Kamkari, Medha Barath, Ricardo Silva, Rahul G. Krishnan • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Causal effect estimation | STAR math scores Regular+Aide vs. Regular class sizes (Weak instrument ρ ≈ 0.28) | Validity 1 | 6 |
| Causal effect estimation | STAR math scores Regular+Aide vs. Regular class sizes (Strong instrument ρ ≈ 0.89) | Validity 1 | 6 |
| Causal effect estimation | Project STAR Reading scores Weak instrument | Validity 1 | 6 |
| Causal effect estimation | Project STAR Reading scores, Strong instrument | Validity 1 | 6 |
| Instrumental Variable Estimation | Airplane demand modified binary (n=2048 samples) | Validity 1 | 6 |
| Instrumental Variable Estimation | STAR math scores Small vs. Regular class sizes Weak instrument, ρ(Z, T) ≈ 0.29 | Validity 100 | 6 |
| Instrumental Variable Estimation | STAR Strong instrument math scores Small vs. Regular class sizes | Validity Score 1 | 6 |
| Partial identification of causal effects | Synthetic Binary-outcome ground-truth bounds known | Validity 100 | 6 |
| Partial identification of causal effects | Jobs semi-synthetic RCT-derived labels | Validity 100 | 6 |
| Partial identification under instrumental variables | STAR small vs. regular class size reading scores Weak instrument ρ(Z, T) ≈ 0.29 | Validity 1 | 6 |
Showing 10 of 11 rows
