Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness
About
Bayesian neural networks (BNNs) and deep ensembles are principled approaches to estimating the predictive uncertainty of a deep learning model. However, their practicality in real-time, industrial-scale applications is limited by their heavy memory and inference costs. This motivates us to study principled approaches to high-quality uncertainty estimation that require only a single deep neural network (DNN). By formalizing uncertainty quantification as a minimax learning problem, we first identify input distance awareness, i.e., the model's ability to quantify the distance of a test example from the training data in the input space, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose the Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance awareness of modern DNNs by adding a weight normalization step during training and replacing the output layer with a Gaussian process. On a suite of vision and language understanding tasks and on modern architectures (Wide-ResNet and BERT), SNGP is competitive with deep ensembles in prediction, calibration, and out-of-domain detection, and outperforms the other single-model approaches.
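The two ingredients named in the abstract, a weight normalization step and a Gaussian process output layer, can be illustrated with a small NumPy sketch. This is a toy under stated assumptions, not the authors' implementation: the residual layer `hidden_map`, the random-Fourier-feature approximation of an RBF kernel in `rff`, the Bayesian linear-regression posterior in `predictive_variance`, and all sizes and constants (`norm_bound=0.95`, 512 features, `noise_var=0.01`) are hypothetical choices made for illustration. The sketch only shows the qualitative effect of distance awareness: predictive variance is larger for an input far from the training data than for one inside it.

```python
import numpy as np

def spectral_normalize(W, norm_bound=0.95):
    """Rescale W so its spectral norm (largest singular value) is <= norm_bound."""
    # Exact spectral norm for clarity; training code would typically
    # estimate it with a few power-iteration steps instead.
    sigma = np.linalg.norm(W, ord=2)
    return W if sigma <= norm_bound else W * (norm_bound / sigma)

def hidden_map(x, W_sn):
    """Assumed residual layer h = x + tanh(x W).

    With ||W||_2 < 1 and tanh being 1-Lipschitz, distances between inputs
    are neither collapsed nor blown up in the hidden space.
    """
    return x + np.tanh(x @ W_sn)

def rff(h, W_rff, b_rff):
    """Random Fourier features approximating a unit-length-scale RBF kernel."""
    n_features = W_rff.shape[1]
    return np.sqrt(2.0 / n_features) * np.cos(h @ W_rff + b_rff)

def predictive_variance(phi_train, phi_test, noise_var=0.01):
    """Posterior variance of a Bayesian linear model on the features.

    Assumes a standard normal prior on the output weights and Gaussian
    observation noise with variance noise_var.
    """
    n_features = phi_train.shape[1]
    precision = np.eye(n_features) + phi_train.T @ phi_train / noise_var
    solved = np.linalg.solve(precision, phi_test.T)  # shape (n_features, n_test)
    return np.sum(phi_test.T * solved, axis=0)       # phi^T Sigma phi per test row

rng = np.random.default_rng(0)
x_train = rng.normal(scale=0.5, size=(100, 2))   # training cluster near the origin
W_sn = spectral_normalize(3.0 * rng.normal(size=(2, 2)))
W_rff = rng.normal(size=(2, 512))
b_rff = rng.uniform(0.0, 2.0 * np.pi, size=512)

phi_train = rff(hidden_map(x_train, W_sn), W_rff, b_rff)
x_near = x_train[:1]                  # a point inside the training data
x_far = np.array([[8.0, 8.0]])        # a point far from the training data

var_near = predictive_variance(phi_train, rff(hidden_map(x_near, W_sn), W_rff, b_rff))[0]
var_far = predictive_variance(phi_train, rff(hidden_map(x_far, W_sn), W_rff, b_rff))[0]
print(f"variance near training data: {var_near:.4f}")
print(f"variance far from training data: {var_far:.4f}")
```

Bounding the spectral norm of the residual branch is what keeps the hidden representation from mapping a distant input onto the training cluster; without that constraint, the output layer could report low variance for an out-of-domain example even though its kernel is distance-based.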
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10 (test) | -- | 3381 |
| Molecular property prediction | MoleculeNet BBBP (scaffold) | ROC-AUC 69.1 | 117 |
| Molecular property prediction | MoleculeNet SIDER (scaffold) | ROC-AUC 0.568 | 97 |
| Out-of-Distribution Detection | CIFAR-100 (in-distribution) vs SVHN (out-of-distribution) (test) | AUROC 86.2 | 90 |
| Molecular property prediction | MoleculeNet BACE (scaffold) | ROC-AUC 78.6 | 87 |
| Out-of-Distribution Detection | CIFAR-10 (ID) vs SVHN (OOD) (test) | AUROC 93.87 | 79 |
| Classification | CUB (test) | Accuracy 61.27 | 79 |
| OOD Detection | CIFAR-100 IND SVHN OOD | AUROC (%) 82.26 | 74 |
| Out-of-Distribution Detection | ImageNet-O | AUROC 0.758 | 74 |
| Molecular property prediction | MoleculeNet MUV (scaffold) | ROC-AUC 0.762 | 68 |