Test-Time Distillation for Continual Model Adaptation

About

Deep neural networks often suffer performance degradation upon deployment due to distribution shifts. Continual Test-Time Adaptation (CTTA) aims to address this issue in an unsupervised manner. However, existing methods that rely on self-supervision are prone to an inherent self-referential feedback loop that amplifies initial prediction errors, leading to model drift. We revisit this limitation and propose Test-Time Distillation (TTD), which reframes adaptation as a distillation process guided by a frozen Vision-Language Model (VLM) as an external signal. While promising, we find that direct distillation is fraught with two pitfalls: (1) the Generalist Trap, where the VLM's broad but non-specialized knowledge leads to suboptimal performance on specific tasks and shifts; and (2) the Entropy Bias, where naive model fusion techniques based on entropy fail due to the disparate calibration of heterogeneous models. These pitfalls highlight the need to build a robust supervisory signal and leverage it to guide the target model toward stable adaptation. Hence, we present CoDiRe, a Continual Distillation and Rectification framework for TTD. CoDiRe first constructs a robust blended teacher by dynamically fusing the predictions of the VLM and the target model. Critically, it circumvents the Entropy Bias by leveraging Maximum Softmax Probability (MSP) as a more reliable confidence metric for weighting each model's expertise. It then applies an Optimal Transport-based rectification to further align predictions with the blended teacher, enabling continuous and stable adaptation. Extensive experiments show that CoDiRe outperforms state-of-the-art baselines, exceeding CoTTA by 10.55% with only 48% of its time cost on ImageNet-C. The project page is publicly available at https://github.com/walawalagoose/TTD.
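
The blended-teacher idea in the abstract can be illustrated with a small sketch. The abstract only states that MSP is used as the confidence metric for weighting the VLM and the target model; the exact weighting rule is not given here, so the normalization below (each model's weight proportional to its own MSP, computed per sample) is an assumption made for illustration, and the names `msp` and `blend_by_msp` are hypothetical rather than taken from the CoDiRe code.

```python
import numpy as np


def msp(probs):
    """Maximum Softmax Probability: the top class probability per sample."""
    return probs.max(axis=-1)


def blend_by_msp(p_vlm, p_target):
    """Fuse two sets of class probabilities with per-sample MSP weights.

    ASSUMPTION: weights are the two MSP scores normalized to sum to 1.
    The paper's actual fusion rule may differ.
    """
    c_vlm = msp(p_vlm)        # shape: (batch,)
    c_tgt = msp(p_target)     # shape: (batch,)
    w_vlm = c_vlm / (c_vlm + c_tgt)
    w = w_vlm[:, None]        # broadcast over the class axis
    blended = w * p_vlm + (1.0 - w) * p_target
    return blended, w_vlm


# Toy batch of two samples over three classes (made-up numbers).
p_vlm = np.array([[0.70, 0.20, 0.10],
                  [0.40, 0.35, 0.25]])
p_target = np.array([[0.34, 0.33, 0.33],
                     [0.05, 0.90, 0.05]])

blended, w_vlm = blend_by_msp(p_vlm, p_target)
```

In this toy batch the VLM is the more confident model on the first sample and the target model on the second, so the blended prediction leans toward whichever model has the higher MSP for that sample. Because the weights form a convex combination, each blended row remains a valid probability distribution.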

Xiao Chen, Jiazhen Huang, Zhiming Liu, Qinting Jiang, Fanding Huang, Jingyan Jiang, Zhi Wang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10C Severity Level 5 (test) | -- | 136 |
| Image Classification | ImageNet-C Severity 5 (test) | Mean Error Rate (Severity 5): 43.97 | 132 |
| Online Continual Test-Time Adaptation | ImageNet-C Severity 5 (test) | Accuracy (Gaussian Noise, ImageNet-C S5): 27.57 | 47 |
| Image Classification | ImageNet-C Severity 5 | Error Rate (Gaussian): 54.43 | 43 |
| Image Classification | CIFAR-100-C Severity 5 | -- | 26 |
| Online Continual Test-Time Adaptation | CIFAR-10-C severity 5 (test) | Gaussian Noise Accuracy (Severity 5): 79.26 | 24 |
| Test-time adaptation | Office-Home | Accuracy: 80.38 | 16 |
| Image Classification | PACS Art | Accuracy (Cartoon Domain): 99.73 | 14 |
| Continual Test-Time Adaptation | ImageNet-C long-term continual adaptation | Average Accuracy: 60.69 | 10 |
| Continual Test-Time Adaptation | CIFAR-10-C | Average Accuracy: 87.25 | 10 |
Showing 10 of 11 rows
