Conformal Prediction for Zero-Shot Models
About
Vision-language models pre-trained at large scale have shown unprecedented adaptability and generalization to downstream tasks. Although their discriminative potential has been widely explored, their reliability and uncertainty remain overlooked. In this work, we investigate the capabilities of CLIP models under the split conformal prediction paradigm, which provides theoretical guarantees for black-box models based on a small, labeled calibration set. In contrast to the vision classifiers studied in most of the conformal prediction literature, foundation models exhibit a particular characteristic: they are pre-trained once on an inaccessible source domain that differs from the transferred task. This domain drift negatively affects the efficiency of the conformal sets and poses additional challenges. To alleviate this issue, we propose Conf-OT, a transfer learning setting that operates transductively over the combined calibration and query sets. By solving an optimal transport problem, the proposed method bridges the domain gap between pre-training and adaptation without requiring additional data splits, while still maintaining coverage guarantees. We comprehensively explore this conformal prediction strategy on a broad span of 15 datasets and three non-conformity scores. Conf-OT provides consistent relative improvements of up to 20% in set efficiency while being 15 times faster than popular transductive approaches.
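For readers unfamiliar with the split conformal paradigm mentioned above, the following is a minimal sketch of calibration and set prediction with the LAC non-conformity score (one minus the probability of the true class). It is a generic illustration, not the Conf-OT method: it contains no optimal transport step, the synthetic softmax outputs merely stand in for CLIP zero-shot probabilities, and all function names (`lac_scores`, `calibrate_threshold`, `predict_sets`) are hypothetical.

```python
import math
import numpy as np


def lac_scores(probs, labels):
    """LAC non-conformity score: 1 minus the probability of the true class."""
    return 1.0 - probs[np.arange(len(labels)), labels]


def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Return the k-th smallest calibration score, k = ceil((n + 1) * (1 - alpha)).

    If k exceeds n, the calibration set is too small for the requested level
    and the threshold is infinite (every class is included in every set).
    """
    n = len(cal_labels)
    scores = np.sort(lac_scores(cal_probs, cal_labels))
    k = math.ceil((n + 1) * (1 - alpha))
    return float(scores[k - 1]) if k <= n else float("inf")


def predict_sets(test_probs, threshold):
    """Boolean mask of shape (n_test, n_classes); True marks classes in the set."""
    return (1.0 - test_probs) <= threshold


# Synthetic stand-in for model outputs (assumption: exchangeable cal/test data).
rng = np.random.default_rng(0)
n_cal, n_test, n_classes = 500, 2000, 10
labels = rng.integers(0, n_classes, size=n_cal + n_test)
logits = rng.normal(size=(n_cal + n_test, n_classes))
logits[np.arange(len(labels)), labels] += 2.0  # make the true class more likely
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

threshold = calibrate_threshold(probs[:n_cal], labels[:n_cal], alpha=0.1)
sets = predict_sets(probs[n_cal:], threshold)

# Empirical coverage (true label inside the set) and efficiency (mean set size).
coverage = sets[np.arange(n_test), labels[n_cal:]].mean()
avg_set_size = sets.sum(axis=1).mean()
print(f"coverage={coverage:.3f}, average set size={avg_set_size:.2f}")
```

With `alpha=0.1`, the empirical coverage should land close to 90% on exchangeable data, while the average set size is the efficiency measure that the abstract refers to: smaller sets at the same coverage are better.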
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Conformal Inference | Average across 15 datasets (test) | Top-1 Accuracy | 81.1 | 60 |
| Image Classification | 11 datasets average CLIP ResNet-101 (test) | Acc | 66.1 | 18 |
| Conformal Prediction | 15 datasets (average) | Top-1 Accuracy | 66.7 | 15 |
| Conformal Prediction | Average across 15 datasets (test) | Top-1 Acc | 66.7 | 12 |
| Conformal Image Classification | Average across 11 datasets CLIP ViT-B/16 features (test) | Accuracy | 72 | 9 |
| Conformal Prediction (APS nonconformity score) | SICAPv2 (test) | ACA | 53.1 | 7 |
| Conformal Prediction (LAC nonconformity score) | SICAPv2 (test) | ACA | 53.1 | 7 |
| Conformal Prediction (RAPS nonconformity score) | SICAPv2 (test) | ACA | 0.531 | 7 |