Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency
About
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function, and then use this quantity to estimate the linear functional. We prove non-asymptotic upper bounds on the mean-squared error of such procedures: these bounds reveal that in order to obtain non-asymptotically optimal procedures, the error in estimating the treatment effect should be minimized in a certain weighted $L^2$-norm. We analyze a two-stage procedure based on constrained regression in this weighted norm, and establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds. These results show that the optimal non-asymptotic risk, in addition to depending on the asymptotically efficient variance, depends on the weighted norm distance between the true outcome function and its approximation by the richest function class supported by the sample size.
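To make the two-stage structure concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's procedure. The binary action set, the linear feature map `features`, the logging policy `behavior_prob`, the target policy `target_prob`, and the simulated data are all hypothetical. The linear functional is taken to be the off-policy value $\mathbb{E}[\sum_a \pi_{\text{target}}(a \mid X)\,\mu(X, a)]$. The first stage fits the outcome function by least squares weighted by the squared importance ratio, a choice motivated by the fact that the variance of the importance-weighted residual term scales with the outcome-model error in that weighted norm. The second stage adds an importance-weighted residual correction to the plug-in estimate on held-out data.

```python
import numpy as np

ACTIONS = np.array([0, 1])  # hypothetical binary action set


def features(x, a):
    """Hypothetical linear feature map for the outcome model mu(x, a)."""
    return np.column_stack([np.ones_like(x), x, a, a * x])


def behavior_prob(x, a):
    """Assumed known logging policy: P(A=1 | x) = 0.5 + 0.3 x for x in [-1, 1]."""
    p1 = 0.5 + 0.3 * x
    return np.where(a == 1, p1, 1.0 - p1)


def target_prob(x, a):
    """Hypothetical target policy: plays action 1 with probability 0.8."""
    return np.where(a == 1, 0.8, 0.2)


def two_stage_estimate(x, a, y):
    """Sample-split estimate of E[sum_a target(a|X) * mu(X, a)]."""
    half = len(y) // 2
    x1, a1, y1 = x[:half], a[:half], y[:half]
    x2, a2, y2 = x[half:], a[half:], y[half:]

    # Stage 1: least squares weighted by (target / behavior)^2.
    # Scaling each row by the ratio itself minimizes sum ratio^2 * residual^2.
    ratio1 = target_prob(x1, a1) / behavior_prob(x1, a1)
    phi1 = features(x1, a1)
    theta, *_ = np.linalg.lstsq(ratio1[:, None] * phi1, ratio1 * y1, rcond=None)

    def mu_hat(xs, acts):
        return features(xs, acts) @ theta

    # Stage 2: plug-in term plus importance-weighted residual correction,
    # evaluated on the held-out half.
    plug_in = sum(
        target_prob(x2, np.full_like(a2, act)) * mu_hat(x2, np.full_like(a2, act))
        for act in ACTIONS
    )
    ratio2 = target_prob(x2, a2) / behavior_prob(x2, a2)
    correction = ratio2 * (y2 - mu_hat(x2, a2))
    return float(np.mean(plug_in + correction))


def simulate(n, seed=0):
    """Draw synthetic logged data; the outcome model here is made up."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    a = rng.binomial(1, 0.5 + 0.3 * x)
    y = 1.0 + 2.0 * x + a * (0.5 - x) + 0.5 * rng.normal(size=n)
    return x, a, y
```

Calling `two_stage_estimate(*simulate(20000))` should return a value close to 1.4, which is the target policy's value under this simulated outcome model (since $\mathbb{E}[X] = 0$). With a misspecified feature map, the weighted first-stage fit is where the choice of norm would start to matter.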
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Treatment Effect Estimation | JOBS semi-synthetic (test) | MSE | 0.0011 | 22 |
| Treatment Effect Estimation | RORCO semi-synthetic | MSE | 0.0032 | 22 |
| Treatment Effect Estimation | ACIC semi-synthetic 2016 (test) | Mean Error | 0.0036 | 22 |
| Treatment Effect Estimation | RORCO Real | Mean Error | -0.0138 | 22 |
| Treatment Effect Estimation | ACIC semi-synthetic 2017 | Mean TEE Error | 0.0048 | 22 |
| Treatment Effect Estimation | NEWS semi-synthetic (test) | MSE | 5.30e-4 | 22 |
| Treatment Effect Estimation | NEWS semi-synthetic | Mean Error | 5.30e-4 | 22 |
| Causal Inference | IHDP | MSE | 0.544 | 20 |
| Treatment Effect Estimation | TWINS | Mean Effect | 0.0086 | 15 |