
Learning plug-in surrogate endpoints for randomized experiments

About

Surrogate endpoints are used in place of long-term outcomes in randomized experiments when observing the real outcome for a large enough cohort is prohibitively expensive or impractical. A short-term surrogate is good if the result of an experiment using the surrogate is predictive of the result of a hypothetical study using the real outcome. Much attention has been paid to formalizing this property in causal terms, but most criteria are unidentifiable and cannot be turned into practical algorithms for learning surrogate endpoints from data. To address this, we study plug-in composite surrogates, functions of post-treatment variables that may be substituted directly for the primary outcome in a randomized experiment. We propose two methods for learning plug-in surrogates that maximize effect predictiveness, and characterize the possibility of finding endpoints that yield unbiased effect estimates in representative scenarios. Finally, in both synthetic experiments with known effects and in data from a real-world experiment, we find that our method, based on directly modeling the surrogate effect, returns plug-in endpoints more predictive of the primary effect than established methods.
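To make the plug-in idea concrete, the following is a minimal sketch, not the authors' learning method. It simulates a few randomized experiments under a hypothetical data-generating process, substitutes a fixed composite surrogate for the primary outcome in a difference-in-means estimate, and reports how far the surrogate effect lands from the primary effect. The names `simulate_experiment` and `plug_in_surrogate`, the surrogate weights, and the outcome model are all assumptions made for illustration.

```python
import random
import statistics


def diff_in_means(values, treated):
    """Difference-in-means effect estimate for a randomized experiment."""
    t = [v for v, a in zip(values, treated) if a]
    c = [v for v, a in zip(values, treated) if not a]
    return statistics.fmean(t) - statistics.fmean(c)


def simulate_experiment(rng, effect, n=2000):
    """Hypothetical process (an assumption): treatment shifts two short-term
    variables s1 and s2, which in turn drive the long-term outcome y."""
    treated, s, y = [], [], []
    for _ in range(n):
        a = rng.random() < 0.5            # randomized treatment assignment
        s1 = effect * a + rng.gauss(0, 1)
        s2 = 0.5 * effect * a + rng.gauss(0, 1)
        treated.append(a)
        s.append((s1, s2))
        y.append(1.0 * s1 + 2.0 * s2 + rng.gauss(0, 1))
    return treated, s, y


def plug_in_surrogate(s_i, weights=(1.0, 2.0)):
    """Composite surrogate: a fixed function of post-treatment variables.
    The weights are assumed here, chosen to match the simulated outcome model."""
    return sum(w * x for w, x in zip(weights, s_i))


rng = random.Random(0)
errors = []
for effect in (0.0, 0.5, 1.0):
    treated, s, y = simulate_experiment(rng, effect)
    tau_primary = diff_in_means(y, treated)
    # Plug the surrogate in where the primary outcome would normally go.
    tau_surrogate = diff_in_means([plug_in_surrogate(si) for si in s], treated)
    errors.append(abs(tau_surrogate - tau_primary))

# Mean absolute error between surrogate and primary effect estimates.
mae = statistics.fmean(errors)
print(f"MAE across experiments: {mae:.3f}")
```

In this toy setup the surrogate captures the full pathway from treatment to outcome, so the two effect estimates differ only by sampling noise. A surrogate that missed part of that pathway would show a systematic gap instead.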

Alessandro-Umberto Margueritte, Ahmet Zahid Balcıoğlu, Jesse Krijthe, Dave Zachariah, Fredrik D. Johansson • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Average Treatment Effect Estimation | IHDP | Mean Error | 1.35 | 12 |
| Surrogate Learning | Synthetic Scenario Case b | MAE | 0.09 | 10 |
| Surrogate Learning | Synthetic Scenario Case c | MAE | 0.2 | 10 |
| Surrogate Learning | Synthetic Scenario Case d Linear | MAE | 0.03 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case b | MAE | 0.01 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case d | MAE | 0.03 | 10 |
| Surrogate Learning | Synthetic Scenario Case a | MAE | 0.5 | 10 |
| Surrogate Learning | Synthetic Scenario Case d | MAE | 0.4 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case c | MAE | 0.09 | 10 |
| Treatment Effect Estimation | Linear synthetic scenario Case a | MAE | 0.03 | 10 |
