
Robust Calibration with Multi-domain Temperature Scaling

About

Uncertainty quantification is essential for the reliable deployment of machine learning models in high-stakes application domains. It becomes all the more challenging when the training and test distributions differ, even when the distribution shifts are mild. Despite the ubiquity of distribution shifts in real-world applications, existing uncertainty quantification approaches mainly study the in-distribution setting, where the train and test distributions are the same. In this paper, we develop a systematic calibration model that handles distribution shifts by leveraging data from multiple domains. Our proposed method -- multi-domain temperature scaling -- uses the heterogeneity across domains to improve calibration robustness under distribution shift. Through experiments on three benchmark datasets, we find that our method outperforms existing approaches on both in-distribution and out-of-distribution test sets.
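Standard temperature scaling is the single-domain building block the paper extends: a scalar temperature T is fitted on held-out validation logits to minimize negative log-likelihood, and test probabilities are computed from logits / T. A minimal sketch (grid search over T; the paper's multi-domain extension, which adapts the temperature using data from several domains, is not reproduced here):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Mean negative log-likelihood of the true labels at temperature T."""
    p = softmax(logits / T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature that minimizes validation NLL via grid search."""
    return min(grid, key=lambda T: nll(logits, labels, T))
```

Fitting T > 1 softens overconfident predictions; T < 1 sharpens underconfident ones. The grid bounds here are illustrative choices, not values from the paper.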

Yaodong Yu, Stephen Bates, Yi Ma, Michael I. Jordan • 2022

Related benchmarks

Task                          Dataset                     Metric    Result   Rank
Calibration                   ImageNet-C (OOD-domains)    ECE (%)   1.43     24
Calibration                   ImageNet-C (InD-domains)    ECE (%)   1.06     24
Calibration                   WILDS-RxRx1 (InD-domains)   ECE       1.87     18
Calibration                   WILDS-RxRx1 (OOD-domains)   ECE       0.0277   18
Calibration                   GLDv2 (InD-domains)         ECE (%)   2.61     18
Calibration                   GLDv2 (OOD-domains)         ECE       3.41     18
Model Performance Prediction  WILDS-RxRx1 (OOD-domains)   MAE       4.76     12
Model Performance Prediction  ImageNet-C (OOD-domains)    MAE       1.66     6
Model Performance Prediction  WILDS-RxRx1 (InD-domains)   MAE       1.39     6
Model Performance Prediction  GLDv2 (InD-domains)         MAE       4.64     6

(Showing 10 of 11 rows)
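The calibration rows above report ECE (expected calibration error): predictions are grouped into confidence bins, and ECE is the weighted average gap between each bin's mean confidence and its accuracy. A minimal sketch, assuming top-1 confidences and 0/1 correctness indicators (the bin count here is an illustrative choice, not the benchmark's setting):

```python
import numpy as np

def ece(confidences, correct, n_bins=15):
    """Expected Calibration Error over equal-width confidence bins:
    sum over bins of (bin weight) * |bin accuracy - bin mean confidence|."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total = len(confidences)
    err = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            err += (mask.sum() / total) * gap
    return err
```

A perfectly calibrated model (confidence matches accuracy in every bin) scores 0; reporting ECE (%) simply multiplies this quantity by 100.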
