
Bin-wise Temperature Scaling (BTS): Improvement in Confidence Calibration Performance through Simple Scaling Techniques

About

The prediction reliability of neural networks is important in many applications. In safety-critical domains in particular, such as cancer prediction or autonomous driving, a reliable confidence estimate for the model's prediction is critical for interpreting the results. Modern deep neural networks have achieved significant performance improvements on many image classification tasks. However, these networks tend to be poorly calibrated in terms of output confidence. Temperature scaling is an efficient post-processing calibration scheme that yields well-calibrated results. In this study, we build on the concept of temperature scaling to develop a more sophisticated bin-wise scaling method. Furthermore, we augment the validation samples for more elaborate scaling. The proposed methods consistently improve calibration performance across various datasets and deep convolutional neural network models.

Byeongmoon Ji, Hyemin Jung, Jihyeun Yoon, Kyungyul Kim, Younghak Shin • 2019
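
To make the idea in the abstract concrete, the following is a minimal sketch of standard temperature scaling together with one possible bin-wise variant. The abstract does not specify the paper's binning or optimization procedure, so the equal-width confidence bins, the grid search over temperatures, the minimum bin size of 10, and all function names here are illustrative assumptions rather than the authors' implementation. The validation-sample augmentation mentioned in the abstract is not covered.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Softmax of logits / T; T may be a scalar or an (N, 1) array."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T=1.0):
    """Mean negative log-likelihood of the true labels at temperature T."""
    p = softmax(logits, T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.25, 5.0, 96)):
    """Standard temperature scaling: pick the T that minimizes validation NLL.
    A grid search is used here for simplicity (an assumption of this sketch)."""
    losses = [nll(logits, labels, T) for T in grid]
    return float(grid[int(np.argmin(losses))])

def bin_index(logits, edges):
    """Assign each sample to a bin by its uncalibrated max softmax confidence."""
    conf = softmax(logits).max(axis=1)
    return np.clip(np.digitize(conf, edges[1:-1]), 0, len(edges) - 2)

def fit_binwise_temperatures(logits, labels, n_bins=5, min_count=10):
    """Assumed bin-wise variant: one temperature per equal-width confidence bin.
    Bins with fewer than min_count samples fall back to the global temperature."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ids = bin_index(logits, edges)
    temps = np.full(n_bins, fit_temperature(logits, labels))
    for b in range(n_bins):
        mask = ids == b
        if mask.sum() >= min_count:
            temps[b] = fit_temperature(logits[mask], labels[mask])
    return edges, temps

def apply_binwise(logits, edges, temps):
    """Rescale each sample's logits with the temperature of its confidence bin."""
    ids = bin_index(logits, edges)
    return softmax(logits, temps[ids][:, None])

# Synthetic, deliberately overconfident "validation set" (illustration only).
rng = np.random.default_rng(0)
N, C = 2000, 10
labels = rng.integers(0, C, size=N)
logits = rng.normal(size=(N, C))
logits[np.arange(N), labels] += 2.0
logits *= 3.0  # inflate logits so the raw softmax is overconfident

T_global = fit_temperature(logits, labels)
edges, temps = fit_binwise_temperatures(logits, labels)
calibrated = apply_binwise(logits, edges, temps)
print("global T:", T_global)
print("per-bin T:", temps)
print("NLL before / after global scaling:",
      nll(logits, labels), nll(logits, labels, T_global))
```

Dividing the logits by a positive scalar leaves each sample's predicted class unchanged, so global temperature scaling affects only confidence, not accuracy. With per-bin temperatures that guarantee still holds per sample, because each row is divided by a single positive value.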

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification Calibration | CIFAR100 | Classwise ECE | 0.0387 | 99 |
| Calibration | Tabular datasets | NLL | 0.324 | 21 |
| Image Classification Calibration | ImageNet | Accuracy | 78.99 | 15 |
| Text Classification | IMDB binary sentiment (five random splits) | NLL | 0.301 | 11 |
| Text Classification | Emotion multi-class (five random splits) | NLL | 0.156 | 9 |
| Image Classification Calibration | BloodMNIST | NLL | 0.3305 | 9 |
