Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning

About

This paper studies the problem of post-hoc calibration of machine learning classifiers. We introduce the following desiderata for uncertainty calibration: (a) accuracy-preserving, (b) data-efficient, and (c) high expressive power. We show that none of the existing methods satisfy all three requirements, and demonstrate how Mix-n-Match calibration strategies (i.e., ensemble and composition) can help achieve remarkably better data-efficiency and expressive power while provably maintaining the classification accuracy of the original classifier. Mix-n-Match strategies are generic in the sense that they can be used to improve the performance of any off-the-shelf calibrator. We also reveal potential issues in standard evaluation practices. Popular approaches (e.g., histogram-based expected calibration error (ECE)) may provide misleading results, especially in the small-data regime. Therefore, we propose an alternative data-efficient kernel density-based estimator for a reliable evaluation of the calibration performance and prove its asymptotic unbiasedness and consistency. Our approaches outperform state-of-the-art solutions on both the calibration and the evaluation tasks in most of the experimental settings. Our code is available at https://github.com/zhang64-llnl/Mix-n-Match-Calibration.
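For readers unfamiliar with the histogram-based ECE that the abstract refers to, the following is a minimal sketch of the standard equal-width binning estimator for top-label calibration error. It is not taken from the authors' repository; the function name `histogram_ece`, its parameters, and the default of 15 bins are assumptions made for illustration only.

```python
from typing import Sequence


def histogram_ece(confidences: Sequence[float],
                  correct: Sequence[bool],
                  n_bins: int = 15) -> float:
    """Equal-width histogram estimate of top-label ECE (illustrative sketch).

    confidences: predicted probability of the predicted class, each in [0, 1].
    correct:     whether each prediction matched the true label.
    n_bins:      number of equal-width bins on [0, 1] (hypothetical default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: sample count, summed confidence, number correct.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins

    for conf, hit in zip(confidences, correct):
        # Map a confidence to its bin; conf == 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += int(hit)

    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = hits / count
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (count / n) * abs(accuracy - avg_conf)
    return ece
```

The estimate depends on `n_bins`, and with few samples each bin's accuracy is averaged over only a handful of points. This is one plausible reading of why the abstract cautions that histogram-based ECE can mislead in the small-data regime.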

Jize Zhang, Bhavya Kailkhura, T. Yong-Jin Han • 2020

Related benchmarks

Task                             Dataset              Result          Rank
Long-Tailed Image Classification ImageNet-LT (test)   --              220
Node Classification              Computers            --              143
Confidence calibration           CIFAR-100-LT (test)  ECE 0.021       53
Model Calibration                CIFAR-10             ECE 1.64        40
Model Calibration                SVHN                 ECE 3.33        40
Confidence calibration           Citeseer             ECE 4.15        36
Confidence calibration           Cora                 ECE 3.45        36
Confidence calibration           Pubmed               ECE 1.63        36
Calibration                      MNIST                ECE 0.21        33
Class-incremental learning       CIFAR-100 (test)     Accuracy 56.25  30
Showing 10 of 49 rows
