Soft Calibration Objectives for Neural Networks
About
Optimal decision making requires that classifiers produce uncertainty estimates consistent with their empirical accuracy. However, deep neural networks are often under- or over-confident in their predictions. Consequently, methods have been developed to improve the calibration of their predictive uncertainty, both during training and post-hoc. In this work, we propose differentiable losses to improve calibration based on a soft (continuous) version of the binning operation underlying popular calibration-error estimators. When incorporated into training, these soft calibration losses achieve state-of-the-art single-model expected calibration error (ECE) across multiple datasets with less than a 1% decrease in accuracy. For instance, on CIFAR-100 we observe an 82% reduction in ECE (70% relative to the post-hoc rescaled ECE) in exchange for a 0.7% relative decrease in accuracy compared with the cross-entropy baseline. When incorporated post-training, the soft-binning-based calibration error objective improves upon temperature scaling, a popular recalibration method. Overall, experiments across losses and datasets demonstrate that using calibration-sensitive procedures yields better uncertainty estimates under dataset shift than the standard practice of using a cross-entropy loss and post-hoc recalibration methods.
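To make the idea of soft binning concrete, the following is a minimal NumPy sketch, not the authors' implementation. It compares a standard equal-width binned ECE with a soft-binned variant in which each prediction is spread over all bins by a softmax over negative squared distances to the bin centers. The function names, the choice of membership function, the bin centers, and the `temperature` parameter are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def hard_binned_ece(confidences, correct, num_bins=15):
    """Equal-width binned ECE: each sample falls into exactly one bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Map each confidence in [0, 1] to a bin index; 1.0 goes in the last bin.
    bin_ids = np.minimum((confidences * num_bins).astype(int), num_bins - 1)
    ece = 0.0
    for b in range(num_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)


def soft_binned_ece(confidences, correct, num_bins=15, temperature=0.01):
    """Soft-binned ECE (illustrative): each sample has a weight in every bin.

    Assumption: membership is a softmax over negative squared distances to
    the bin centers, with `temperature` controlling how sharp the bins are.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    centers = (np.arange(num_bins) + 0.5) / num_bins
    logits = -((confidences[:, None] - centers[None, :]) ** 2) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1

    bin_mass = weights.sum(axis=0)  # soft count of samples per bin
    safe_mass = np.maximum(bin_mass, 1e-12)  # avoid division by zero
    bin_conf = weights.T @ confidences / safe_mass  # weighted mean confidence
    bin_acc = weights.T @ correct / safe_mass  # weighted mean accuracy
    gaps = np.abs(bin_acc - bin_conf)
    return float(np.sum(bin_mass / len(confidences) * gaps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conf = rng.uniform(0.5, 1.0, size=1000)
    # Simulate an over-confident model: true accuracy is below the confidence.
    hits = rng.uniform(size=1000) < (conf - 0.1)
    print("hard-binned ECE:", hard_binned_ece(conf, hits))
    print("soft-binned ECE:", soft_binned_ece(conf, hits))
```

Because the soft version contains no hard assignment step, the same expression written in an automatic-differentiation framework has gradients with respect to the confidences, which is what allows a quantity of this kind to be used as a training loss. As `temperature` shrinks, the soft memberships concentrate on the nearest bin center and the estimate approaches the hard-binned one.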
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Calibration | USPS | ECE | 9.32 | 57 |
| Top-label Confidence Calibration | MNIST | ECE | 5.19 | 42 |
| Image Classification Calibration | PACS Photo | ECE | 9.88 | 39 |
| Selective Classification | CIFAR-100 (test) | AUC | 0.9271 | 32 |
| Top-label Confidence Calibration | SVHN | ECE | 62.2 | 30 |
| Class-wise Calibration | MNIST | CwECE | 3.23 | 30 |
| Image Classification Calibration | PACS Art | ECE | 22.8 | 30 |
| Image Classification Calibration | PACS Cartoon | ECE | 18.7 | 30 |
| Image Classification Calibration | PACS Sketch | ECE | 19.6 | 30 |
| Calibration | CIFAR-10 5000-sample half (test) | ECE | 0.0135 | 23 |