Conformal Prediction for Zero-Shot Models

About

Vision-language models pre-trained at large scale have shown unprecedented adaptability and generalization to downstream tasks. Although their discriminative potential has been widely explored, their reliability and uncertainty remain overlooked. In this work, we investigate the capabilities of CLIP models under the split conformal prediction paradigm, which provides theoretical guarantees for black-box models based on a small, labeled calibration set. In contrast to the main body of literature on conformal predictors in vision classifiers, foundation models exhibit a particular characteristic: they are pre-trained once on an inaccessible source domain that differs from the transferred task. This domain drift negatively affects the efficiency of the conformal sets and poses additional challenges. To alleviate this issue, we propose Conf-OT, a transfer learning setting that operates transductively over the combined calibration and query sets. By solving an optimal transport problem, the proposed method bridges the domain gap between pre-training and adaptation without requiring additional data splits while still maintaining coverage guarantees. We comprehensively explore this conformal prediction strategy on a broad span of 15 datasets and three non-conformity scores. Conf-OT provides consistent relative improvements of up to 20% on set efficiency while being 15 times faster than popular transductive approaches.
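To make the split conformal pipeline described in the abstract concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. It uses the LAC score (one minus the softmax probability of the true class), the standard finite-sample-corrected quantile, and synthetic probabilities standing in for CLIP zero-shot outputs. The `sinkhorn` helper is a generic Sinkhorn-Knopp rescaling applied jointly to calibration and query probabilities; the choice of a class prior estimated from calibration labels as the target marginal is an assumption made for illustration, and the exact Conf-OT formulation should be taken from the paper. All function names are hypothetical.

```python
import numpy as np

def lac_scores(probs, labels):
    """LAC non-conformity score: 1 - probability assigned to the true class."""
    return 1.0 - probs[np.arange(len(labels)), labels]

def conformal_threshold(cal_scores, alpha):
    """Order statistic of rank ceil((n + 1) * (1 - alpha)) among calibration scores."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:  # too few calibration points: only the full label set is valid
        return np.inf
    return np.sort(cal_scores)[k - 1]

def prediction_sets(probs, threshold):
    """Boolean mask (n_samples, n_classes): a class is kept if its score <= threshold."""
    return (1.0 - probs) <= threshold

def sinkhorn(probs, col_marginal, n_iters=50, eps=1e-12):
    """Generic Sinkhorn-Knopp scaling (illustrative, not the paper's exact solver).

    Alternates column scaling toward n * col_marginal and row normalization,
    so each returned row is again a probability vector.
    """
    n = probs.shape[0]
    q = probs.copy()
    for _ in range(n_iters):
        q *= (n * col_marginal) / (q.sum(axis=0) + eps)
        q /= q.sum(axis=1, keepdims=True) + eps
    return q

# Synthetic stand-in for zero-shot softmax outputs (assumption: no real CLIP model here).
rng = np.random.default_rng(0)
n_cal, n_test, n_classes, alpha = 200, 500, 10, 0.10
labels = rng.integers(0, n_classes, size=n_cal + n_test)
logits = rng.normal(size=(n_cal + n_test, n_classes))
logits[np.arange(len(labels)), labels] += 2.0  # inject signal for the true class
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Transductive step: rescale calibration and query rows together, without test labels.
prior = (np.bincount(labels[:n_cal], minlength=n_classes) + 1) / (n_cal + n_classes)
probs_ot = sinkhorn(probs, prior)

# Split conformal prediction on the transported probabilities.
cal_probs, test_probs = probs_ot[:n_cal], probs_ot[n_cal:]
tau = conformal_threshold(lac_scores(cal_probs, labels[:n_cal]), alpha)
sets = prediction_sets(test_probs, tau)

coverage = sets[np.arange(n_test), labels[n_cal:]].mean()  # fraction of sets containing the truth
avg_set_size = sets.sum(axis=1).mean()                     # smaller means more efficient
print(f"coverage={coverage:.3f}, average set size={avg_set_size:.2f}")
```

In this sketch the same transformation is applied to calibration and query rows in one pass, which mirrors the abstract's point that no additional data split is needed. Efficiency is measured here as the average prediction-set size, and coverage as the fraction of test points whose true label falls inside the set.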

Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz • 2025

Related benchmarks

Task	Dataset	Metric	Value	
Conformal Inference	Average across 15 datasets (test)	Top-1 Accuracy	81.1	60
Conformal Prediction	15 datasets (average)	Coverage	90	39
Image Classification	11 datasets average CLIP ResNet-101 (test)	Acc	66.1	18
Conformal Prediction	Average across 15 datasets (test)	Top-1 Acc	66.7	12
Conformal Image Classification	Average across 11 datasets CLIP ViT-B/16 features (test)	Accuracy	72	9
Conformal Prediction (APS nonconformity score)	SICAPv2 (test)	ACA	53.1	7
Conformal Prediction (LAC nonconformity score)	SICAPv2 (test)	ACA	53.1	7
Conformal Prediction (RAPS nonconformity score)	SICAPv2 (test)	ACA	0.531	7

