
Adapting in the Dark: Efficient and Stable Test-Time Adaptation for Black-Box Models

About

Test-Time Adaptation (TTA) for black-box models accessible only via APIs remains a largely unexplored challenge. Existing approaches such as post-hoc output refinement offer limited adaptive capacity, while Zeroth-Order Optimization (ZOO) enables input-space adaptation but faces high query costs and optimization challenges in the unsupervised TTA setting. We introduce BETA (Black-box Efficient Test-time Adaptation), a framework that addresses these limitations by employing a lightweight, local white-box steering model to create a tractable gradient pathway. Through a prediction harmonization technique combined with consistency regularization and prompt learning-oriented filtering, BETA enables stable adaptation with no additional API calls and negligible latency beyond standard inference. On ImageNet-C, BETA achieves a +7.1% accuracy gain on ViT-B/16 and +3.4% on CLIP, surpassing strong white-box and gray-box methods including TENT and TPT. On a commercial API, BETA achieves comparable performance to ZOO at 250x lower cost while maintaining real-time inference speed, establishing it as a practical and efficient solution for real-world black-box TTA.

Yunbei Zhang, Shuaicheng Niu, Chengyi Cai, Feng Liu, Jihun Hamm • 2026
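To make the query-cost concern about Zeroth-Order Optimization concrete, here is a minimal sketch of a generic two-point random-direction gradient estimator. It is not BETA and not the ZOO baseline evaluated in the paper. The names `QueryCounter`, `zo_gradient`, and `toy_loss`, and the values of `num_dirs` and `mu`, are hypothetical choices made for illustration. A toy quadratic stands in for the black-box API, and every call to it is counted as one query.

```python
import random


class QueryCounter:
    """Wraps a black-box scalar function and counts how often it is called."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.fn(x)


def zo_gradient(f, x, num_dirs=8, mu=1e-3, rng=None):
    """Two-point random-direction gradient estimate.

    Each direction needs two evaluations of f, so one estimate
    costs 2 * num_dirs black-box queries.
    """
    rng = rng or random.Random(0)
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_dirs):
        # Sample a random Gaussian probing direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        # Central finite difference along u approximates the directional derivative.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(dim):
            grad[i] += slope * u[i] / num_dirs
    return grad


def toy_loss(x):
    """Hypothetical stand-in for a loss computed from API outputs."""
    return sum(xi * xi for xi in x)


black_box = QueryCounter(toy_loss)
x0 = [1.0, -2.0, 0.5]
g = zo_gradient(black_box, x0, num_dirs=8)
queries_used = black_box.calls
print("queries for one gradient estimate:", queries_used)
```

In this sketch a single gradient estimate already costs 16 queries for a three-dimensional input, and the cost grows with the number of probing directions. This is the kind of overhead that the abstract says BETA avoids by routing gradients through a local white-box steering model instead of extra API calls.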

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | ImageNet-R (test) | Accuracy: 76 | 170 |
| Image Classification | ImageNet-Sketch (test) | -- | 153 |
| Image Classification | ImageNet-C Severity 5 (test) | Mean Error Rate (Severity 5): 62.6 | 132 |
| Image Classification | ImageNet A, R, S V2 (test) | Accuracy (ImageNet-A): 62.8 | 42 |
| Image Classification | ImageNet-C | Gauss Error: 59 | 36 |
| Skin lesion classification | Derm7pt | -- | 15 |
| Image Classification | EuroSAT | Accuracy (%): 53.3 | 5 |
| Image Classification | ImageNet-C all 15 corruptions severity 5 | Average Accuracy: 54.7 | 3 |