COME: Test-time adaption by Conservatively Minimizing Entropy
About
Machine learning models must continuously adapt to novel data distributions in the open world. As the predominant principle, entropy minimization (EM) has proven to be a simple yet effective cornerstone of existing test-time adaption (TTA) methods. Unfortunately, its fatal limitation (i.e., overconfidence) tends to result in model collapse. To address this issue, we propose to Conservatively Minimize the Entropy (COME), a simple drop-in replacement for traditional EM. In essence, COME explicitly models uncertainty by characterizing a Dirichlet prior distribution over model predictions during TTA. By doing so, COME naturally regularizes the model to favor conservative confidence on unreliable samples. Theoretically, we provide a preliminary analysis showing that COME enhances optimization stability by introducing a data-adaptive lower bound on the entropy. Empirically, our method achieves state-of-the-art performance on commonly used benchmarks, with significant improvements in classification accuracy and uncertainty estimation under various settings, including standard, life-long, and open-world TTA, with up to $34.5\%$ improvement in accuracy and $15.1\%$ in false positive rate.
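To make the contrast with plain EM concrete, the sketch below compares the usual softmax entropy with the entropy of a Dirichlet-based "opinion" that reserves an explicit mass for uncertainty. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the parameterization, so the mapping `evidence = exp(logits)`, `alpha = evidence + 1`, and the function names `dirichlet_opinion` and `opinion_entropy` are hypothetical choices borrowed from common evidential-learning practice.

```python
import numpy as np

def softmax_entropy(logits):
    """Standard EM objective: Shannon entropy of the softmax prediction."""
    z = logits - logits.max(axis=-1, keepdims=True)  # stabilize the exponentials
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -(np.exp(log_p) * log_p).sum(axis=-1)

def dirichlet_opinion(logits):
    """Map logits to K class beliefs plus one uncertainty mass (sums to 1).

    Assumption (not stated in the abstract): evidence = exp(logits) and
    Dirichlet parameters alpha = evidence + 1.
    """
    k = logits.shape[-1]
    evidence = np.exp(np.clip(logits, -30.0, 30.0))       # clip to avoid overflow
    strength = evidence.sum(axis=-1, keepdims=True) + k   # sum of alpha
    belief = evidence / strength
    uncertainty = k / strength
    return np.concatenate([belief, uncertainty], axis=-1)

def opinion_entropy(logits):
    """Entropy over the (K + 1)-way opinion, a candidate conservative objective."""
    opinion = dirichlet_opinion(logits)
    return -(opinion * np.log(opinion + 1e-12)).sum(axis=-1)

weak = np.zeros((1, 4))                      # little evidence for any class
strong = np.array([[8.0, 0.0, 0.0, 0.0]])    # strong evidence for class 0

print("uncertainty mass (weak):  ", dirichlet_opinion(weak)[0, -1])
print("uncertainty mass (strong):", dirichlet_opinion(strong)[0, -1])
```

Under these assumptions, a sample with little total evidence keeps a large uncertainty mass instead of being forced toward a confident class, which is the kind of conservative behavior the abstract describes for unreliable samples.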
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | CollegeMATH | Accuracy | 25.42 | 327 |
| Mathematical Reasoning | AIME 24 | Accuracy | 6.67 | 318 |
| Mathematical Reasoning | Minerva | Accuracy (Acc) | 20.96 | 146 |
| Image Classification | ImageNet-C Severity 5 (test) | Mean Error Rate (Severity 5) | 43 | 132 |
| Image Classification | ImageNet-C | Accuracy | 59 | 117 |
| Reasoning | GSM8K | -- | -- | 111 |
| Image Classification | ImageNet-C level 5 | Avg Top-1 Acc (ImageNet-C L5) | 58.5 | 110 |
| Reasoning | MATH 500 | Accuracy (%) | 48.8 | 94 |
| Image Classification | ImageNet A | Accuracy | 31.2 | 73 |
| Image Classification | ImageNet-Sketch | Accuracy | 49.2 | 63 |