
Protected Test-Time Adaptation via Online Entropy Matching: A Betting Approach

About

We present a novel approach for test-time adaptation via online self-training, consisting of two components. First, we introduce a statistical framework that detects distribution shifts in the classifier's entropy values obtained on a stream of unlabeled samples. Second, we devise an online adaptation mechanism that utilizes the evidence of distribution shifts captured by the detection tool to dynamically update the classifier's parameters. The resulting adaptation process drives the distribution of test entropy values obtained from the self-trained classifier to match that of the source domain, building invariance to distribution shifts. This approach departs from the conventional self-training method, which focuses on minimizing the classifier's entropy. Our approach combines concepts in betting martingales and online learning to form a detection tool capable of quickly reacting to distribution shifts. We then reveal a tight relation between our adaptation scheme and optimal transport, which forms the basis of our novel self-supervised loss. Experimental results demonstrate that our approach improves test-time accuracy under distribution shifts while maintaining accuracy and calibration in their absence, outperforming leading entropy minimization methods across various scenarios.
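The detection component described above can be illustrated with a minimal sketch. It assumes (the abstract does not spell this out) that each test entropy is mapped through the source-domain entropy CDF, so the resulting values are approximately uniform on [0, 1] when there is no shift, and that a simple bet of the form 1 + eps * (u - 0.5) is placed against uniformity. The function names, the clipping range for eps, and the gradient-ascent step size are illustrative choices, not the authors' implementation.

```python
import bisect
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def empirical_cdf(source_entropies):
    """Return a function mapping an entropy value to its source-domain CDF value."""
    ordered = sorted(source_entropies)
    n = len(ordered)

    def cdf(z):
        # Fraction of source entropies that are <= z.
        return bisect.bisect_right(ordered, z) / n

    return cdf


def run_betting_martingale(us, lr=0.5):
    """Track log-wealth for the bet S_t = S_{t-1} * (1 + eps_t * (u_t - 0.5)).

    eps_t is fixed before u_t is seen and clipped to [-1, 1] (an assumed,
    conservative range), so every factor lies in [0.5, 1.5] and the wealth
    stays positive. If u_t ~ Uniform(0, 1), each factor has expectation 1.
    """
    log_wealth, eps, path = 0.0, 0.0, []
    for u in us:
        factor = 1.0 + eps * (u - 0.5)
        log_wealth += math.log(factor)
        path.append(log_wealth)
        # Online gradient ascent on the log-wealth of the current round.
        grad = (u - 0.5) / factor
        eps = max(-1.0, min(1.0, eps + lr * grad))
    return path


# Hypothetical usage: test entropies that sit above most source values.
source_cdf = empirical_cdf([0.1 * i for i in range(1, 101)])
shifted_us = [source_cdf(9.0)] * 200
print(run_betting_martingale(shifted_us)[-1])
```

By Ville's inequality, a nonnegative martingale that starts at 1 exceeds 1/alpha with probability at most alpha when there is no shift, so a large log-wealth can be read as evidence of a shift. In this sketch the bet size eps only feeds the detector; how that evidence drives the classifier update is left out.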

Yarin Bar, Shalev Shaer, Yaniv Romano • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Reading Comprehension | C3 | Accuracy | 53.41 | 73 |
| Aspect-level Sentiment Analysis | COTE BD | F1 Score | 89.67 | 34 |
| Sentiment Analysis | ChnSent | Accuracy | 94.41 | 17 |
| Text Classification | TNEWS | Accuracy | 54.72 | 17 |
| Aspect-based Sentiment Analysis | COTE-MFW syntactically perturbed | F1 Score | 85.7 | 17 |
| Relation Extraction | FinRE | F1 Score | 73.18 | 17 |
| Sentiment Analysis | Amazon syntactically perturbed | Accuracy | 59.63 | 17 |
| Question Answering | CMRC syntactically perturbed 2018 | F1 Score | 76.24 | 17 |
| Reading Comprehension | DRCD | F1 Score | 83.39 | 17 |
| Reading Comprehension | SanWen syntactically perturbed | F1 Score | 87.52 | 17 |
Showing 10 of 17 rows
