Cluster Frequency Conformal Prediction for Local Coverage
About
Conformal prediction provides distribution-free coverage guarantees, but in many-class classification it may still under-cover specific classes or subpopulations, which prevents safe deployment in high-stakes applications. We propose Cluster Frequency Conformal Prediction (CFCP), a plug-in framework that adapts conformal prediction to local structure in a learned representation space. CFCP clusters learned embeddings, estimates cluster-level label-frequency distributions from calibration data, and, for each test point, constructs a sample-specific probability vector by softly mixing the distributions of nearby clusters, regularized with global-prior and reliability-aware shrinkage. This vector is then conformalized using standard set constructors. In the disjoint-split regime, CFCP inherits standard finite-sample marginal validity. Under additional assumptions, CFCP further admits a local-validity interpretation: since representation clusters aggregate locally similar samples, their empirical class frequencies provide a stable estimate of local label ambiguity. Across image and text benchmarks, CFCP achieves the best class coverage in 15/16 dataset/score-family comparisons while keeping prediction set sizes competitive, and it is substantially more efficient in several settings. Overall, our results show that cluster-frequency information provides an effective localized signal for improving classwise reliability in many-class conformal prediction.
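The pipeline described above (cluster embeddings, estimate per-cluster label frequencies, softly mix nearby clusters, then conformalize) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. All function names and the `strength` and `temperature` parameters are hypothetical. The sketch assumes plain k-means for clustering, additive shrinkage toward the global class prior (so that small clusters are pulled harder toward the prior), softmax weights over squared distances for the soft mixing, and the score `1 - p[y]` with a standard split-conformal quantile as the set constructor.

```python
import numpy as np


def sq_dists(X, centers):
    """Squared Euclidean distance from every row of X to every center."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)


def kmeans(X, k, n_iter=25, seed=0):
    """Plain Lloyd's k-means; an assumed stand-in for the clustering step."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        assign = sq_dists(X, centers).argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers


def cluster_frequencies(X, y, centers, n_classes, strength=10.0):
    """Per-cluster label frequencies, shrunk toward the global class prior.

    Assumed shrinkage form: (counts + strength * prior) / (n_j + strength).
    """
    assign = sq_dists(X, centers).argmin(axis=1)
    prior = np.bincount(y, minlength=n_classes) / len(y)
    freqs = np.empty((len(centers), n_classes))
    for j in range(len(centers)):
        counts = np.bincount(y[assign == j], minlength=n_classes)
        freqs[j] = (counts + strength * prior) / (counts.sum() + strength)
    return freqs


def local_probabilities(X, centers, freqs, temperature=1.0):
    """Sample-specific probability vectors: softmax-weighted cluster mixtures."""
    logits = -sq_dists(X, centers) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ freqs


def conformal_threshold(probs_cal, y_cal, alpha=0.1):
    """Split-conformal threshold for the score 1 - p[y] on held-out data."""
    scores = 1.0 - probs_cal[np.arange(len(y_cal)), y_cal]
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:  # too few calibration points: every label is kept
        return np.inf
    return np.sort(scores)[rank - 1]


def prediction_sets(probs, threshold):
    """Boolean mask of shape (n_samples, n_classes); True means label is in the set."""
    return (1.0 - probs) <= threshold
```

To stay in the disjoint-split regime, one would fit `kmeans` and `cluster_frequencies` on one portion of the held-out data and call `conformal_threshold` on a separate portion, then apply `local_probabilities` and `prediction_sets` to test embeddings. How the data is actually split, and which shrinkage and mixing rules are used, should be taken from the paper rather than from this sketch.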
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Classification | WOS-46985 | WUC | 0.048 | 24 |
| Conformal Prediction | CIFAR-100 (five repeated splits) | Class Coverage | 59.8 | 24 |
| Image Classification | CIFAR-100 | Coverage | 59.8 | 24 |
| Image Classification | ImageNet | Coverage | 67.2 | 24 |
| Image Classification | ImageNet V2 | Coverage (Cov) | 71.4 | 24 |
| Text Classification | WOS-46985 | Coverage | 64.5 | 24 |
| Classification | CIFAR-100 | WUC | 0.025 | 24 |
| Classification | ImageNet V2 | WUC | 0.046 | 24 |
| Classification | ImageNet | WUC | 0.03 | 24 |
| Conformal Prediction | ImageNet (five repeated splits) | Class Coverage | 67.2 | 21 |