Learning plug-in surrogate endpoints for randomized experiments
About
Surrogate endpoints are used in place of long-term outcomes in randomized experiments when observing the real outcome for a large enough cohort is prohibitively expensive or impractical. A short-term surrogate is good if the result of an experiment using the surrogate is predictive of the result of a hypothetical study using the real outcome. Much attention has been paid to formalizing this property in causal terms, but most criteria are unidentifiable and cannot be turned into practical algorithms for learning surrogate endpoints from data. To address this, we study plug-in composite surrogates, functions of post-treatment variables that may be substituted directly for the primary outcome in a randomized experiment. We propose two methods for learning plug-in surrogates that maximize effect predictiveness, and characterize the possibility of finding endpoints that yield unbiased effect estimates in representative scenarios. Finally, both in synthetic experiments with known effects and in data from a real-world experiment, we find that our method, based on directly modeling the surrogate effect, returns plug-in endpoints more predictive of the primary effect than established methods.
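To make the plug-in idea concrete, the following is a minimal sketch of a composite surrogate that is substituted for the primary outcome when estimating an effect by difference in means. It is not either of the two methods proposed in the paper: it uses a plain least-squares prediction of the primary outcome from post-treatment variables as a stand-in. The data-generating process, the function names (`fit_linear_surrogate`, `apply_surrogate`, `simulate_experiment`), and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def difference_in_means(values, treatment):
    """Effect estimate in a randomized experiment: mean(treated) - mean(control)."""
    values = np.asarray(values, dtype=float)
    treatment = np.asarray(treatment, dtype=bool)
    return values[treatment].mean() - values[~treatment].mean()


def _with_intercept(post_treatment):
    """Prepend a column of ones so the fitted surrogate has an intercept."""
    return np.column_stack([np.ones(len(post_treatment)), post_treatment])


def fit_linear_surrogate(post_treatment, primary):
    """Least-squares fit of the primary outcome on post-treatment variables.

    Returns the weights (intercept first) that define the composite surrogate.
    This is an illustrative baseline, not the paper's method.
    """
    weights, *_ = np.linalg.lstsq(_with_intercept(post_treatment), primary, rcond=None)
    return weights


def apply_surrogate(weights, post_treatment):
    """Evaluate the composite surrogate on new post-treatment observations."""
    return _with_intercept(post_treatment) @ weights


def simulate_experiment(rng, n, effect_on_m):
    """Hypothetical data: treatment T shifts two short-term variables M,
    and the primary outcome Y depends on T only through M (an assumption)."""
    t = rng.integers(0, 2, size=n).astype(bool)
    m = rng.normal(size=(n, 2)) + np.outer(t.astype(float), effect_on_m)
    y = m @ np.array([1.0, -0.5]) + rng.normal(scale=0.5, size=n)
    return t, m, y


rng = np.random.default_rng(0)

# Historical experiment where both M and the primary outcome Y were observed.
t_hist, m_hist, y_hist = simulate_experiment(rng, 5000, np.array([0.5, 0.2]))
weights = fit_linear_surrogate(m_hist, y_hist)

# New experiment: in practice only M would be available; Y is simulated here
# solely to check how well the plug-in estimate tracks the primary effect.
t_new, m_new, y_new = simulate_experiment(rng, 5000, np.array([0.1, 0.6]))
surrogate_effect = difference_in_means(apply_surrogate(weights, m_new), t_new)
primary_effect = difference_in_means(y_new, t_new)

print(f"plug-in surrogate effect: {surrogate_effect:.3f}")
print(f"primary outcome effect:   {primary_effect:.3f}")
```

In this simulated setup the treatment influences the primary outcome only through the measured post-treatment variables, so the surrogate effect should stay close to the primary effect. If the treatment also affected the primary outcome along a path not captured by those variables, a surrogate of this form would generally give a biased effect estimate.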
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Average Treatment Effect Estimation | IHDP | Mean Error | 1.35 | 12 |
| Surrogate Learning | Synthetic Scenario Case b | MAE | 0.09 | 10 |
| Surrogate Learning | Synthetic Scenario Case c | MAE | 0.2 | 10 |
| Surrogate Learning | Synthetic Scenario Case d Linear | MAE | 0.03 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case b | MAE | 0.01 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case d | MAE | 0.03 | 10 |
| Surrogate Learning | Synthetic Scenario Case a | MAE | 0.5 | 10 |
| Surrogate Learning | Synthetic Scenario Case d | MAE | 0.4 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case c | MAE | 0.09 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case a | MAE | 0.03 | 10 |