COME: Test-time adaption by Conservatively Minimizing Entropy

About

Machine learning models must continuously adjust themselves to novel data distributions in the open world. As the predominant principle, entropy minimization (EM) has proven to be a simple yet effective cornerstone of existing test-time adaption (TTA) methods. Unfortunately, its fatal limitation (i.e., overconfidence) tends to result in model collapse. To address this issue, we propose to Conservatively Minimize the Entropy (COME), a simple drop-in replacement for traditional EM. In essence, COME explicitly models uncertainty by characterizing a Dirichlet prior distribution over model predictions during TTA. By doing so, COME naturally regularizes the model to favor conservative confidence on unreliable samples. Theoretically, we provide a preliminary analysis showing that COME enhances optimization stability by introducing a data-adaptive lower bound on the entropy. Empirically, our method achieves state-of-the-art performance on commonly used benchmarks, with significant improvements in classification accuracy and uncertainty estimation under various settings, including standard, life-long, and open-world TTA, namely up to $34.5\%$ improvement in accuracy and $15.1\%$ in false positive rate.
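The contrast between plain entropy minimization and a Dirichlet-based view of the predictions can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: evidence is taken as `exp(logit)`, the Dirichlet parameters as evidence plus one, and the function names (`softmax_entropy`, `dirichlet_opinion`, `opinion_entropy`) are hypothetical. It should not be read as the authors' implementation.

```python
import math


def softmax_entropy(logits):
    """Shannon entropy of softmax(logits), the quantity standard EM minimizes."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dirichlet_opinion(logits):
    """Belief masses and an explicit uncertainty mass from a Dirichlet.

    Assumption (not stated in the abstract): evidence e_k = exp(z_k) and
    alpha_k = e_k + 1, so with S = sum(alpha) the belief masses are
    b_k = e_k / S and the uncertainty mass is u = K / S.
    Note: math.exp overflows for very large logits; this is only a sketch.
    """
    k = len(logits)
    evidence = [math.exp(z) for z in logits]
    strength = sum(evidence) + k  # S = sum over k of (e_k + 1)
    beliefs = [e / strength for e in evidence]
    uncertainty = k / strength
    return beliefs, uncertainty


def opinion_entropy(logits):
    """Entropy over the K belief masses plus the uncertainty mass."""
    beliefs, uncertainty = dirichlet_opinion(logits)
    masses = beliefs + [uncertainty]
    return -sum(p * math.log(p) for p in masses if p > 0.0)


if __name__ == "__main__":
    weak = [0.2, 0.1, 0.0]    # little evidence for any class
    strong = [8.0, 0.0, 0.0]  # strong evidence for class 0
    for name, z in (("weak", weak), ("strong", strong)):
        _, u = dirichlet_opinion(z)
        print(name, "softmax entropy:", softmax_entropy(z),
              "uncertainty mass:", u, "opinion entropy:", opinion_entropy(z))
```

Under these assumptions the uncertainty mass depends on the total evidence rather than only on the relative sizes of the logits, which gives the model a way to stay unconfident on a sample without committing to any single class.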

Qingyang Zhang, Yatao Bian, Xinke Kong, Peilin Zhao, Changqing Zhang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Mathematical Reasoning | CollegeMATH | Accuracy: 25.42 | 327 |
| Mathematical Reasoning | AIME 24 | Accuracy: 6.67 | 318 |
| Mathematical Reasoning | Minerva | Accuracy (Acc): 20.96 | 146 |
| Image Classification | ImageNet-C Severity 5 (test) | Mean Error Rate (Severity 5): 43 | 132 |
| Image Classification | ImageNet-C | Accuracy: 59 | 117 |
| Reasoning | GSM8K | -- | 111 |
| Image Classification | ImageNet-C level 5 | Avg Top-1 Acc (ImageNet-C L5): 58.5 | 110 |
| Reasoning | MATH 500 | Accuracy (%): 48.8 | 94 |
| Image Classification | ImageNet A | Accuracy: 31.2 | 73 |
| Image Classification | ImageNet-Sketch | Accuracy: 49.2 | 63 |
Showing 10 of 46 rows
