DOCTOR: A Simple Method for Detecting Misclassification Errors

About

Deep neural networks (DNNs) have been shown to perform very well on large-scale object recognition problems, which has led to their widespread use in real-world applications, including situations where DNNs are implemented as "black boxes". A promising approach to securing their use is to accept decisions that are likely to be correct while discarding the others. In this work, we propose DOCTOR, a simple method that aims to identify whether the prediction of a DNN classifier should (or should not) be trusted so that, consequently, it can be accepted or rejected. Two scenarios are investigated: Totally Black Box (TBB), where only the soft-predictions are available, and Partially Black Box (PBB), where gradient propagation to perform input pre-processing is allowed. Empirically, we show that DOCTOR outperforms all state-of-the-art methods on various well-known image and sentiment analysis datasets. In particular, we observe a reduction of up to $4\%$ in the false rejection rate (FRR) in the PBB scenario. DOCTOR can be applied to any pre-trained model, does not require prior information about the underlying dataset, and is as simple as the simplest available methods in the literature.
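To make the accept/reject idea in the TBB scenario concrete, here is a minimal sketch of a rejection rule that uses only the soft-predictions, together with an FRR computation. The score used here (one minus the sum of squared class probabilities, divided by that sum), the threshold `gamma`, and all function names are assumptions chosen for illustration; the abstract does not specify the exact rule, so this should not be read as the authors' implementation.

```python
from typing import Sequence


def rejection_score(probs: Sequence[float]) -> float:
    """Return (1 - g) / g, where g is the sum of squared class probabilities.

    Assumption: this Gini-style ratio is an illustrative uncertainty score.
    It is 0 for a one-hot prediction and grows as the prediction flattens.
    """
    g = sum(p * p for p in probs)
    return (1.0 - g) / g


def accept(probs: Sequence[float], gamma: float) -> bool:
    """Accept the prediction when its score does not exceed gamma (hypothetical threshold)."""
    return rejection_score(probs) <= gamma


def false_rejection_rate(accepted: Sequence[bool], correct: Sequence[bool]) -> float:
    """Fraction of correctly classified samples that were nevertheless rejected."""
    n_correct = sum(1 for c in correct if c)
    n_rejected_correct = sum(1 for a, c in zip(accepted, correct) if c and not a)
    return n_rejected_correct / n_correct if n_correct else 0.0


# Toy soft-predictions for three inputs (made-up numbers, not from the paper).
soft_preds = [
    [0.97, 0.02, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # nearly flat prediction
    [0.60, 0.30, 0.10],  # moderately confident prediction
]
correct = [True, False, True]  # whether the arg-max class was right (made up)

decisions = [accept(p, gamma=1.0) for p in soft_preds]
frr = false_rejection_rate(decisions, correct)
```

With `gamma=1.0`, only the first toy prediction is accepted, so one of the two correct predictions is rejected. In practice the threshold would be swept to trade off rejected errors against falsely rejected correct predictions.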

Federica Granese, Marco Romanelli, Daniele Gorla, Catuscia Palamidessi, Pablo Piantanida • 2021

Related benchmarks

Task	Dataset	Result
Out-of-Distribution Detection	CIFAR-10 (in-distribution) TinyImageNet (out-of-distribution) (test)	AUROC 98.9	79
Out-of-Distribution Detection	CIFAR-10 in-distribution LSUN out-of-distribution (test)	AUROC 98.6	73
Out-of-Distribution Detection	CIFAR100 (ID) vs SVHN (OOD) (test)	AUROC 91	67
Out-of-Distribution Detection	CIFAR100 (in) CIFAR10 (out)	AUROC 76.8	57
Selective Classification	Unseen Cls. shift	AURC 0.3684	48
Error detection	In-distribution (test)	AUC 0.89	40
OOD Detection	CoComageNet	Detection AUC 0.7249	40
Error detection	Average All shifts (test)	AUC 85.58	40
Error detection	Corruptions (test)	AUC 96.26	40
Distribution Shift Detection	BROAD (test)	Novel Classes AUC 91.27	40

Showing 10 of 40 rows
