Boosting Out-of-Distribution Detection with Multiple Pre-trained Models
About
Out-of-Distribution (OOD) detection, i.e., identifying whether an input is sampled from a novel distribution other than the training distribution, is a critical task for safely deploying machine learning systems in the open world. Recently, post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems. This advance raises a natural question: Can we leverage the diversity of multiple pre-trained models to improve the performance of post hoc detection methods? In this work, we propose a detection enhancement method by ensembling multiple detection decisions derived from a zoo of pre-trained models. Our approach uses the p-value instead of the commonly used hard threshold and leverages a fundamental framework of multiple hypothesis testing to control the true positive rate of In-Distribution (ID) data. We focus on the usage of model zoos and provide systematic empirical comparisons with current state-of-the-art methods on various OOD detection benchmarks. The proposed ensemble scheme shows consistent improvement compared to single-model detectors and significantly outperforms the current competitive methods. Our method substantially improves the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks.
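To make the p-value ensembling idea concrete, here is a minimal sketch of how per-model detection scores might be turned into p-values and combined. It is an illustration under stated assumptions, not the authors' implementation: the abstract names only "a fundamental framework of multiple hypothesis testing", so the Benjamini-Hochberg step-up rule used here is an assumed choice. The function names `empirical_p_values` and `ensemble_is_ood` are hypothetical, as is the convention that a higher score means "more ID-like".

```python
import numpy as np


def empirical_p_values(val_scores, test_scores):
    """Empirical p-values for one model (hypothetical helper).

    val_scores: 1-D array of detection scores on held-out ID data.
    test_scores: 1-D array of detection scores on test inputs.
    Assumes higher score = more ID-like, so a small p-value is evidence of OOD.
    """
    val_sorted = np.sort(np.asarray(val_scores))
    # Count how many ID validation scores are <= each test score.
    counts = np.searchsorted(val_sorted, np.asarray(test_scores), side="right")
    # The +1 correction keeps p-values strictly positive.
    return (counts + 1) / (len(val_sorted) + 1)


def ensemble_is_ood(p_values, alpha=0.05):
    """Combine p-values from several models (assumed Benjamini-Hochberg rule).

    p_values: array of shape (n_models, n_test).
    Returns a boolean array of shape (n_test,), True where the input is
    flagged as OOD, i.e. where at least one model's null hypothesis
    ("the input is ID") is rejected by the step-up procedure.
    """
    p_values = np.asarray(p_values)
    n_models = p_values.shape[0]
    sorted_p = np.sort(p_values, axis=0)  # ascending per test input
    thresholds = alpha * np.arange(1, n_models + 1) / n_models
    return (sorted_p <= thresholds[:, None]).any(axis=0)
```

Unlike a hard threshold tuned per model, the level `alpha` here is shared across the model zoo. Under the usual assumptions of the step-up procedure (independent or positively dependent p-values), an ID input is flagged with probability of roughly `alpha` at most, which is one way the ID true positive rate could be kept near `1 - alpha`.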
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Out-of-Distribution Detection | SUN OOD with ImageNet-1k In-distribution (test) | FPR@95 | 48.87 | 159 |
| Out-of-Distribution Detection | ImageNet OOD Average 1k (test) | FPR@95 | 28.1 | 137 |
| Out-of-Distribution Detection | CIFAR-10 vs CIFAR-100 | AUROC | 97.12 | 41 |
| Out-of-Distribution Detection | CIFAR10 (ID) / ISUN (OOD) (test) | FPR@95 | 5.48 | 41 |
| Out-of-Distribution Detection | CIFAR-10 In-Dist Texture Out-Dist | AUROC | 99.88 | 41 |
| Out-of-Distribution Detection | ImageNet (ID) vs Places365 (OOD) 1.0 (test) | FPR95 | 53.96 | 41 |
| Out-of-Distribution Detection | CIFAR10 ID Place365 OOD (test) | AUROC | 97.99 | 35 |
| Out-of-Distribution Detection | CIFAR-10 vs SVHN | AUC | 0.9943 | 30 |
| Out-of-Distribution Detection | CIFAR-10 OOD (Averaged Performance) (test) | AUROC | 99.12 | 28 |
| Out-of-Distribution Detection | CIFAR10 / LSUN | FPR | 1.5 | 20 |