Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration

About

An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor the model is perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints.
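As a rough illustration of the idea named in the title, the sketch below combines a model's class probabilities with a single human label using a human confusion matrix and a calibration step. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the human label and the model output are conditionally independent given the true class, uses temperature scaling as one possible calibration method, and all function names (`estimate_confusion`, `temperature_scale`, `combine`) and the smoothing prior are hypothetical.

```python
import numpy as np

def estimate_confusion(human_labels, true_labels, n_classes, prior=1.0):
    """Estimate C[y, h] ~ P(human says h | true class y).

    Additive smoothing (`prior`, an assumed value) keeps every entry
    positive when only a few labeled datapoints are available.
    """
    counts = np.full((n_classes, n_classes), prior, dtype=float)
    for y, h in zip(true_labels, human_labels):
        counts[y, h] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def temperature_scale(probs, temperature):
    """Recalibrate probabilities by dividing log-probabilities by T."""
    logits = np.log(np.clip(probs, 1e-12, 1.0)) / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    scaled = np.exp(logits)
    return scaled / scaled.sum(axis=-1, keepdims=True)

def combine(model_probs, human_label, confusion, temperature=1.0):
    """Posterior over classes: calibrated model prob times P(h | y), normalized."""
    calibrated = temperature_scale(np.asarray(model_probs, dtype=float), temperature)
    likelihood = confusion[:, human_label]  # P(h | y) for every candidate y
    posterior = calibrated * likelihood
    return posterior / posterior.sum()

# Toy example: the model slightly prefers class 0, the human says class 1,
# and the human is assumed to be right 80% of the time on every class.
confusion = np.array([[0.8, 0.1, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.1, 0.1, 0.8]])
posterior = combine([0.5, 0.4, 0.1], human_label=1, confusion=confusion)
prediction = int(np.argmax(posterior))
```

In this toy case the human label overrides the model because the model's margin between classes 0 and 1 is small. A more confident model (for example 0.95 on class 0) would keep its own prediction, which matches the abstract's point that the model's confidence matters alongside the two individual accuracies.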

Gavin Kerrigan, Padhraic Smyth, Mark Steyvers • 2021

Related benchmarks

Task | Dataset | Result | Rank
Calibration | CIFAR-10H | ECE 0.84 | 52
Image Classification Calibration | ImageNet 16H 1.0 (test) | ECE 0.0197 | 35
Image Classification | CIFAR-10H | Error Rate (%) 2.22 | 25
Classification | ImageNet-16H noise level 80 | Error Rate 6.03 | 14
Classification | ImageNet-16H Noise Level 110 | Error Rate 11.62 | 14
Classification | ImageNet 16H Noise Level 95 | Error Rate 9.67 | 7
Classification | ImageNet 16H Noise Level 125 | Error Rate 22.6 | 7
Image Classification | ImageNet noise level 95 16H (test) | Error Rate 7.89 | 7
Image Classification | ImageNet noise level 125 16H (test) | Error Rate 19.45 | 7
Probability Calibration | ImageNet 16H Noise Level 95 | ECE 3.23 | 7
