Merging uncertainty sets via majority vote
About
Given $K$ uncertainty sets that are arbitrarily dependent -- for example, confidence intervals for an unknown parameter obtained with $K$ different estimators, or prediction sets obtained via conformal prediction based on $K$ different algorithms on shared data -- we address the question of how to efficiently combine them, in a black-box manner, into a single uncertainty set. We present a simple and broadly applicable majority vote procedure that produces a merged set with nearly the same error guarantee as the input sets. We then extend this core idea in several ways: weighted averaging is a powerful way to incorporate prior information, and a simple randomization trick produces strictly smaller merged sets without altering the coverage guarantee. Further improvements are possible if the sets are exchangeable. We also show that many modern methods, such as split conformal prediction, median of means, HulC, and cross-fitted "double machine learning", can be effectively derandomized using these ideas.
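For one-dimensional intervals the majority vote is easy to sketch: keep exactly the points contained in more than half of the $K$ input intervals, or, in the weighted variant, the points whose covering intervals carry more than half of the total weight. The "nearly the same error guarantee" follows from Markov's inequality: if each input set miscovers with probability at most $\alpha$, the chance that at least half the sets miss the target is at most $2\alpha$, so the majority-vote set miscovers with probability at most $2\alpha$. The Python sketch below illustrates the procedure with a sweep over interval endpoints; the function name `majority_vote_merge` and its `weights` argument are illustrative choices, not the authors' reference implementation, and it assumes the merged region is a finite union of intervals.

```python
def majority_vote_merge(intervals, weights=None):
    """Merge K (possibly dependent) intervals by (weighted) majority vote.

    A point x belongs to the merged set iff the total weight of the
    intervals containing x exceeds 1/2; with uniform weights this means
    "x lies in more than half of the K intervals".

    intervals: list of (lo, hi) pairs.
    weights:   optional nonnegative weights summing to 1 (default uniform).
    Returns the merged set as a list of (lo, hi) pairs.
    """
    K = len(intervals)
    if weights is None:
        weights = [1.0 / K] * K

    # Sweep over endpoints, tracking the total covering weight.
    # Ties: openings (tag 0) sort before closings (tag 1), so two closed
    # intervals that touch at a point are both counted there.
    events = []
    for (lo, hi), w in zip(intervals, weights):
        events.append((lo, 0, +w))
        events.append((hi, 1, -w))
    events.sort()

    merged, mass, start = [], 0.0, None
    for x, _, dw in events:
        before = mass
        mass += dw
        if before <= 0.5 < mass:      # weight rises above 1/2: region opens
            start = x
        elif before > 0.5 >= mass:    # weight falls to <= 1/2: region closes
            merged.append((start, x))
    return merged


# Three arbitrarily dependent intervals for the same target:
print(majority_vote_merge([(0.0, 2.0), (1.0, 3.0), (2.5, 4.0)]))
# -> [(1.0, 2.0), (2.5, 3.0)]
```

In this sketch, the randomization trick mentioned above amounts to replacing the fixed 1/2 threshold in the two comparisons with a suitably drawn random one, which can only shrink the merged set.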
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Conformal Prediction | CIFAR-10 (test) | -- | -- | 21 |
| Regression | OpenML 361247 | Coverage | 98 | 12 |
| Regression | OpenML 361249 | Coverage | 96.4 | 12 |
| Regression | OpenML 361235 | Coverage | 96.9 | 12 |
| Regression | OpenML 361244 | Coverage | 96.8 | 12 |
| Regression | OpenML 361243 | Coverage | 96.3 | 12 |
| Conformal Prediction | MNIST | Coverage (alpha=0.025) | 99.4 | 11 |
| Regression | OpenML 361236 | Coverage | 95.7 | 6 |
| Regression | OpenML 361242 (N=21263, d=81) | Coverage | 95.5 | 6 |
| Regression | OpenML 361234 | Coverage | 95.1 | 6 |