Gradient Short-Circuit: Efficient Out-of-Distribution Detection via Feature Intervention
About
Out-of-Distribution (OOD) detection is critical for safely deploying deep models in open-world environments, where inputs may lie outside the training distribution. During inference on a model trained exclusively with In-Distribution (ID) data, we observe a salient gradient phenomenon: around an ID sample, the local gradient directions for "enhancing" that sample's predicted class remain relatively consistent, whereas OOD samples, which are unseen in training, exhibit disorganized or conflicting gradient directions in the same neighborhood. Motivated by this observation, we propose an inference-stage technique to short-circuit those feature coordinates that spurious gradients exploit to inflate OOD confidence, while leaving ID classification largely intact. To circumvent the expense of recomputing the logits after this gradient short-circuit, we further introduce a local first-order approximation that accurately captures the post-modification outputs without a second forward pass. Experiments on standard OOD benchmarks show our approach yields substantial improvements. Moreover, the method is lightweight and requires minimal changes to the standard inference pipeline, offering a practical path toward robust OOD detection in real-world applications.
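The abstract does not say how the short-circuited coordinates are chosen or at which layer the intervention happens, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a linear classification head over penultimate features, selects coordinates by gradient-times-activation for the predicted class, and zeroes the top `k` of them; the function names and `k` are hypothetical. The updated logits are obtained from a first-order correction on the original logits rather than a second pass through the head.

```python
from typing import List, Sequence, Tuple


def logits(W: Sequence[Sequence[float]], b: Sequence[float],
           f: Sequence[float]) -> List[float]:
    """Linear head: z = W f + b (assumed form of the classifier head)."""
    return [sum(w * x for w, x in zip(row, f)) + b_i for row, b_i in zip(W, b)]


def short_circuit(W: Sequence[Sequence[float]], b: Sequence[float],
                  f: Sequence[float], k: int) -> Tuple[List[float], List[float]]:
    """Zero the k coordinates that contribute most to the predicted-class logit.

    Returns the modified features and a first-order estimate of the new logits.
    The selection rule (gradient times activation) is an assumption made for
    this sketch; the source does not specify it.
    """
    z = logits(W, b, f)
    c = max(range(len(z)), key=lambda i: z[i])   # predicted class
    grad = W[c]                                   # d z_c / d f for a linear head
    contrib = [g * x for g, x in zip(grad, f)]
    top = sorted(range(len(f)), key=lambda j: contrib[j], reverse=True)[:k]

    f_new = list(f)
    for j in top:
        f_new[j] = 0.0

    # First-order update: z(f') ~ z(f) + J (f' - f). Only the k zeroed
    # coordinates have a nonzero delta, so the correction costs O(classes * k).
    z_approx = [z_i - sum(row[j] * f[j] for j in top) for z_i, row in zip(z, W)]
    return f_new, z_approx
```

For a purely linear head the first-order estimate coincides with the recomputed logits; with nonlinear layers after the intervention point it would only be an approximation, which is the regime the abstract's approximation claim concerns.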
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| OOD Detection | CIFAR-10 standard (test) | -- | 17 |
| OOD Detection | CIFAR-10 Overall (Combined Shift Regimes) Final aggregate (test) | AUROC 60.78 | 7 |
| OOD Detection | CIFAR-10 Adversarial FGSM, PGD, and AutoAttack averages (test) | AUROC 55.93 | 7 |
| OOD Detection | CIFAR-10-C Corruption average (test) | AUROC 60.47 | 7 |
| OOD Detection | OOD-benchmarks SVHN, LSUN, iSUN, Textures, Places365 (test) | AUROC 0.7429 | 7 |