Test-Time Distillation for Continual Model Adaptation
About
Deep neural networks often suffer performance degradation upon deployment due to distribution shifts. Continual Test-Time Adaptation (CTTA) aims to address this issue in an unsupervised manner. However, existing methods that rely on self-supervision are prone to an inherent self-referential feedback loop that amplifies initial prediction errors, leading to model drift. We revisit this limitation and propose Test-Time Distillation (TTD), which reframes adaptation as a distillation process guided by a frozen Vision-Language Model (VLM) as an external signal. While promising, we find that direct distillation is fraught with two pitfalls: (1) the Generalist Trap, where the VLM's broad but non-specialized knowledge leads to suboptimal performance on specific tasks and shifts; and (2) the Entropy Bias, where naive model fusion techniques based on entropy fail due to the disparate calibration of heterogeneous models. These pitfalls highlight the need to build a robust supervisory signal and leverage it to guide the target model toward stable adaptation. Hence, we present CoDiRe, a Continual Distillation and Rectification framework for TTD. CoDiRe first constructs a robust blended teacher by dynamically fusing the predictions of the VLM and the target model. Critically, it circumvents the Entropy Bias by leveraging Maximum Softmax Probability (MSP) as a more reliable confidence metric for weighting each model's expertise. It then applies an Optimal Transport-based rectification to further align predictions with the blended teacher, enabling continuous and stable adaptation. Extensive experiments show that CoDiRe outperforms state-of-the-art baselines, exceeding CoTTA by 10.55% on ImageNet-C at only 48% of its time cost. The project page is publicly available at https://github.com/walawalagoose/TTD.
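To make the two steps in the abstract concrete, here is a minimal NumPy sketch of an MSP-weighted blended teacher followed by an Optimal Transport-style rectification. It is an illustration under stated assumptions, not the CoDiRe implementation: the abstract does not give the exact fusion rule or transport formulation, so the per-sample weight `msp_vlm / (msp_vlm + msp_tgt)`, the Sinkhorn-Knopp iteration, the uniform class prior, and the function names `msp_blend` and `sinkhorn_rectify` are all hypothetical choices.

```python
import numpy as np


def msp_blend(p_vlm, p_target):
    """Blend two (N, K) probability matrices per sample.

    Assumption: each model's weight is proportional to its Maximum
    Softmax Probability (MSP); the paper's exact rule may differ.
    """
    msp_vlm = p_vlm.max(axis=1, keepdims=True)
    msp_tgt = p_target.max(axis=1, keepdims=True)
    w_vlm = msp_vlm / (msp_vlm + msp_tgt)
    return w_vlm * p_vlm + (1.0 - w_vlm) * p_target


def sinkhorn_rectify(p, n_iters=50, eps=0.1, class_prior=None):
    """Rectify a batch of predictions with entropic optimal transport.

    Assumption: rows (samples) carry equal mass 1/N and columns (classes)
    follow `class_prior`, uniform by default. Uses Sinkhorn-Knopp scaling.
    """
    n, k = p.shape
    if class_prior is None:
        class_prior = np.full(k, 1.0 / k)
    # Gibbs kernel exp(log p / eps); clipping avoids log(0).
    q = np.power(np.clip(p, 1e-12, None), 1.0 / eps)
    q /= q.sum()
    for _ in range(n_iters):
        # Match column marginals to the class prior.
        q *= (class_prior / q.sum(axis=0))[None, :]
        # Match row marginals to 1/N per sample.
        q *= ((1.0 / n) / q.sum(axis=1))[:, None]
    # Rescale so that each row is a probability distribution again.
    return q * n


# Toy batch: the VLM is confident on sample 0, the target model on sample 1.
p_vlm = np.array([[0.9, 0.1], [0.5, 0.5]])
p_target = np.array([[0.5, 0.5], [0.2, 0.8]])

teacher = msp_blend(p_vlm, p_target)
rectified = sinkhorn_rectify(teacher)
```

In this toy batch the blended teacher leans toward whichever model is more confident on each sample, which a plain average would not do. The rectified output could then serve as a soft target for a distillation loss on the target model; that last step is omitted here because the abstract does not specify the loss.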
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10C Severity Level 5 (test) | -- | -- | 136 |
| Image Classification | ImageNet-C Severity 5 (test) | Mean Error Rate (Severity 5) | 43.97 | 132 |
| Online Continual Test-Time Adaptation | ImageNet-C Severity 5 (test) | Accuracy (Gaussian Noise, ImageNet-C S5) | 27.57 | 47 |
| Image Classification | ImageNet-C Severity 5 | Error Rate (Gaussian) | 54.43 | 43 |
| Image Classification | CIFAR-100-C Severity 5 | -- | -- | 26 |
| Online Continual Test-Time Adaptation | CIFAR-10-C severity 5 (test) | Gaussian Noise Accuracy (Severity 5) | 79.26 | 24 |
| Test-time adaptation | Office-Home | Accuracy | 80.38 | 16 |
| Image Classification | PACS Art | Accuracy (Cartoon Domain) | 99.73 | 14 |
| Continual Test-Time Adaptation | ImageNet-C long-term continual adaptation | Average Accuracy | 60.69 | 10 |
| Continual Test-Time Adaptation | CIFAR-10-C | Average Accuracy | 87.25 | 10 |