Inference on Optimal Policy Values and Other Irregular Functionals via Softmax Smoothing
About
Constructing confidence intervals for the value of an (unknown) optimal treatment policy is a fundamental problem in causal inference. Insight into the optimal policy value can guide the development of reward-maximizing, individualized treatment regimes. However, because the functional that defines the optimal value is non-differentiable, standard semi-parametric approaches to inference are not directly applicable. Many existing works circumvent non-differentiability by making the unrealistic assumption of zero probability of treatment non-response, i.e., that every unit responds (either positively or negatively) to an assigned treatment. Further, works that avoid this assumption rely on refitting nuisance models a number of times proportional to the sample size. In this paper, we construct and analyze a simple, softmax smoothing-based estimator for the value of an optimal treatment policy. Our estimator applies in both static and dynamic treatment regimes, requires fitting only a constant number of nuisance models, and is statistically efficient when there is zero probability of non-response to treatment. Moreover, although our estimator does not require semi-parametric restrictions, it can exploit them when they hold. We further show how our softmax smoothing approach can be used to estimate general parameters that are specified as a maximum of scores involving nuisance components, and consider conditional Balke and Pearl bounds and $L^1$ calibration error as salient examples.
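To make the smoothing idea concrete, the sketch below replaces a hard maximum over per-treatment scores with a softmax-weighted average, which is differentiable in the scores and approaches the maximum as the temperature parameter grows. This is only an illustration of one common form of softmax smoothing: the function name `softmax_smoothed_max`, the parameter `beta`, and the toy array `q` are assumptions made here, and the paper's actual estimator (including nuisance estimation and any debiasing) is not reproduced.

```python
import numpy as np


def softmax_smoothed_max(scores, beta):
    """Softmax-weighted average of `scores` along the last axis.

    Weights are exp(beta * s_k) / sum_j exp(beta * s_j). The result never
    exceeds the row-wise maximum and approaches it as beta grows.
    """
    scores = np.asarray(scores, dtype=float)
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = beta * (scores - scores.max(axis=-1, keepdims=True))
    weights = np.exp(shifted)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights * scores).sum(axis=-1)


# Hypothetical conditional mean outcomes for 3 units under 2 treatments.
# The first row is a tie, i.e. a unit that does not respond to treatment.
q = np.array([[1.0, 1.0],
              [0.2, 0.9],
              [0.5, 0.4]])

hard_value = q.max(axis=1).mean()  # non-smooth plug-in policy value
smoothed_values = {
    beta: softmax_smoothed_max(q, beta).mean() for beta in (1.0, 10.0, 100.0)
}

for beta, value in smoothed_values.items():
    print(f"beta={beta:6.1f}  smoothed={value:.6f}  hard={hard_value:.6f}")
```

In this toy setting the smoothed value sits below the hard maximum and the gap shrinks as `beta` increases, which suggests the bias-versus-smoothness trade-off that a choice of temperature has to balance.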
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Instrumental Variable Estimation | STAR Strong instrument math scores Small vs. Regular class sizes | Validity Score | 1 | 6 |
| Partial identification of causal effects | Synthetic Binary-outcome ground-truth bounds known | Validity | 100 | 6 |
| Partial identification of causal effects | Jobs semi-synthetic RCT-derived labels | Validity | 90 | 6 |
| Causal effect estimation | STAR math scores Regular+Aide vs. Regular class sizes (Weak instrument ρ ≈ 0.28) | Validity | 1 | 6 |
| Causal effect estimation | STAR math scores Regular+Aide vs. Regular class sizes (Strong instrument ρ ≈ 0.89) | Validity | 1 | 6 |
| Causal effect estimation | Project STAR Reading scores Weak instrument | Validity | 1 | 6 |
| Causal effect estimation | Project STAR Reading scores, Strong instrument | Validity | 1 | 6 |
| Instrumental Variable Estimation | Airplane demand modified binary (n=2048 samples) | Validity | 1 | 6 |
| Instrumental Variable Estimation | STAR math scores Small vs. Regular class sizes Weak instrument, ρ(Z, T) ≈ 0.29 | Validity | 100 | 6 |
| Partial identification under instrumental variables | STAR small vs. regular class size reading scores Weak instrument ρ(Z, T) ≈ 0.29 | Validity | 1 | 6 |