# Distribution-Free, Risk-Controlling Prediction Sets

## About
While improving prediction accuracy has been the focus of machine learning in recent years, this alone does not suffice for reliable decision-making. Deploying learning systems in consequential settings also requires calibrating and communicating the uncertainty of predictions. To convey instance-wise uncertainty for prediction tasks, we show how to generate set-valued predictions from a black-box predictor that control the expected loss on future test points at a user-specified level. Our approach provides explicit finite-sample guarantees for any dataset by using a holdout set to calibrate the size of the prediction sets. This framework enables simple, distribution-free, rigorous error control for many tasks, and we demonstrate it in five large-scale machine learning problems: (1) classification problems where some mistakes are more costly than others; (2) multi-label classification, where each observation has multiple associated labels; (3) classification problems where the labels have a hierarchical structure; (4) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (5) protein structure prediction. Lastly, we discuss extensions to uncertainty quantification for ranking, metric learning, and distributionally robust learning.
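To make the calibration idea concrete, below is a minimal sketch (not the paper's code) of holdout-set calibration for multiclass miscoverage: scan a threshold λ from permissive to strict, compute a simple Hoeffding upper confidence bound on the empirical risk of the resulting prediction sets, and keep the strictest λ whose bound stays below the target level α. The function names, the 1001-point threshold grid, and the softmax-score setup are all illustrative assumptions, and the Hoeffding bound stands in for the tighter concentration bounds developed in the paper.

```python
import numpy as np

def hoeffding_ucb(mean, n, delta):
    # (1 - delta) upper confidence bound for the mean of a [0, 1]-bounded loss
    return mean + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def calibrate_lambda(cal_scores, cal_labels, alpha=0.1, delta=0.1):
    """Illustrative sketch: choose the largest threshold lam such that the
    sets {y : score[y] >= lam} have a risk upper bound <= alpha on a holdout set.

    cal_scores: (n, K) array of per-class scores (e.g. softmax outputs)
    cal_labels: (n,) array of true class indices
    The loss here is miscoverage: 1 if the true label is excluded from the set.
    """
    n = len(cal_labels)
    lam_hat = 0.0  # lam = 0 includes every label, so risk is trivially 0
    for lam in np.linspace(0.0, 1.0, 1001):
        # miscoverage loss for each holdout point at threshold lam
        losses = (cal_scores[np.arange(n), cal_labels] < lam).astype(float)
        if hoeffding_ucb(losses.mean(), n, delta) <= alpha:
            lam_hat = lam  # bound still holds; a larger lam gives smaller sets
        else:
            break  # risk grows monotonically in lam, so stop at first failure
    return lam_hat

def predict_set(scores, lam_hat):
    # set-valued prediction: all labels scoring at least lam_hat
    return np.where(scores >= lam_hat)[0]
```

The early exit relies on the loss being monotone in λ (shrinking the sets can only increase miscoverage), which is what lets a one-dimensional scan over the holdout set certify the chosen threshold.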
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Selective Prediction | NyayaBench v2 | Guaranteed Test Coverage (α=0.20) | 14.4 | 9 |
| Selective Prediction | MASSIVE (test) | Guaranteed Test Coverage (α=0.10) | 73.8 | 8 |
| Selective Prediction | CLINC-150 v1 (test) | Performance (α=0.10) | 93.2 | 7 |
| Selective Prediction | Banking77 (ncal=6,468, δ=0.10, simulated confidence scores; test) | Accuracy (α=0.15) | 87.8 | 7 |