Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm

About

Estimating the effect of treatments from natural experiments, where treatments are pre-assigned, is an important and well-studied problem. We introduce a novel natural experiment dataset obtained from an early childhood literacy nonprofit. Surprisingly, applying over 20 established estimators to the dataset produces inconsistent results in evaluating the nonprofit's efficacy. To address this, we create a benchmark to evaluate estimator accuracy using synthetic outcomes, whose design was guided by domain experts. The benchmark extensively explores performance as real-world conditions like sample size, treatment correlation, and propensity score accuracy vary. Based on our benchmark, we observe that the class of doubly robust treatment effect estimators, which are based on simple and intuitive regression adjustment, generally outperforms other, more complicated estimators by orders of magnitude. To better support our theoretical understanding of doubly robust estimators, we derive a closed-form expression for the variance of any such estimator that uses dataset splitting to obtain an unbiased estimate. This expression motivates the design of a new doubly robust estimator that uses a novel loss function when fitting functions for regression adjustment. We release the dataset and benchmark in a Python package; the package is built in a modular way to facilitate new datasets and estimators.

R. Teal Witter, Christopher Musco • 2024
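
To make the abstract's description of doubly robust estimation with dataset splitting concrete, here is a minimal sketch of a generic cross-fitted doubly robust (AIPW-style) estimator of the average treatment effect. It is an illustration under stated assumptions, not the authors' algorithm or the API of their package: the names `fit_linear` and `doubly_robust_ate` are hypothetical, the propensity scores `p` are assumed to be given, and the regression adjustment uses ordinary least squares rather than the novel loss function the paper proposes.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a prediction function.

    Hypothetical helper standing in for any regression-adjustment model.
    """
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef


def doubly_robust_ate(X, t, y, p, n_folds=2, seed=0):
    """Cross-fitted doubly robust estimate of the average treatment effect.

    X: (n, d) covariates, t: (n,) binary treatment indicators (0/1),
    y: (n,) observed outcomes, p: (n,) propensity scores P(t = 1 | x),
    assumed known or estimated elsewhere.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    scores = np.empty(n)
    for k, held in enumerate(folds):
        # Fit outcome models only on the other folds (dataset splitting),
        # so predictions on the held-out fold do not reuse its outcomes.
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        X_tr, t_tr, y_tr = X[train], t[train], y[train]
        f1 = fit_linear(X_tr[t_tr == 1], y_tr[t_tr == 1])  # treated outcome model
        f0 = fit_linear(X_tr[t_tr == 0], y_tr[t_tr == 0])  # control outcome model

        X_h, t_h, y_h, p_h = X[held], t[held], y[held], p[held]
        m1, m0 = f1(X_h), f0(X_h)
        # Regression-adjusted difference plus propensity-weighted residuals.
        scores[held] = (
            m1 - m0
            + t_h * (y_h - m1) / p_h
            - (1 - t_h) * (y_h - m0) / (1 - p_h)
        )
    return scores.mean()
```

On synthetic data with a known treatment effect, calling `doubly_robust_ate(X, t, y, p)` should return a value close to that effect when either the outcome models or the propensity scores are accurate, which is the sense in which such estimators are called doubly robust.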

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Treatment Effect Estimation | RORCO semi-synthetic | MSE 1.07e-5 | 22 |
| Treatment Effect Estimation | ACIC semi-synthetic 2016 (test) | Mean Error 1.03e-4 | 22 |
| Treatment Effect Estimation | JOBS semi-synthetic (test) | MSE 1.40e-4 | 22 |
| Treatment Effect Estimation | NEWS semi-synthetic (test) | MSE 1.33e-7 | 22 |
| Treatment Effect Estimation | ACIC semi-synthetic 2017 | Mean TEE Error 6.61e-5 | 22 |
| Treatment Effect Estimation | NEWS semi-synthetic | Mean Error 1.33e-7 | 22 |
| Treatment Effect Estimation | RORCO Real | Mean Error -0.0572 | 22 |
| Causal Inference | IHDP | MSE 0.201 | 20 |
| Treatment Effect Estimation | TWINS | Mean Effect 0.01 | 15 |
