Test-Time Distillation for Continual Model Adaptation

About

Deep neural networks often suffer performance degradation upon deployment due to distribution shifts. Continual Test-Time Adaptation (CTTA) aims to address this issue in an unsupervised manner. However, existing methods that rely on self-supervision are prone to an inherent self-referential feedback loop that amplifies initial prediction errors, leading to model drift. We revisit this limitation and propose Test-Time Distillation (TTD), which reframes adaptation as a distillation process guided by a frozen Vision-Language Model (VLM) as an external signal. While this approach is promising, we find that direct distillation is fraught with two pitfalls: (1) the Generalist Trap, where the VLM's broad but non-specialized knowledge leads to suboptimal performance on specific tasks and shifts; and (2) the Entropy Bias, where naive model fusion techniques based on entropy fail due to the disparate calibration of heterogeneous models. These pitfalls highlight the need to build a robust supervisory signal and leverage it to guide the target model toward stable adaptation. Hence, we present CoDiRe, a Continual Distillation and Rectification framework for TTD. CoDiRe first constructs a robust blended teacher by dynamically fusing the predictions of the VLM and the target model. Critically, it circumvents the Entropy Bias by leveraging Maximum Softmax Probability (MSP) as a more reliable confidence metric for weighting each model's expertise. It then applies an Optimal Transport-based rectification to further align predictions with the blended teacher, enabling continuous and stable adaptation. Extensive experiments show that CoDiRe outperforms state-of-the-art baselines, exceeding CoTTA by 10.55% with only 48% of its time cost on ImageNet-C. The project page is publicly available at https://github.com/walawalagoose/TTD.
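The blended-teacher idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`msp_blend`, `distill_loss`), the per-sample normalization of the two MSP scores into a weight, and the cross-entropy distillation loss are all assumptions made for illustration, and the Optimal Transport-based rectification step is not shown.

```python
import numpy as np


def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def msp_blend(vlm_logits, target_logits):
    """Fuse two models' predictions, weighting each by its MSP.

    Hypothetical helper: the weighting rule (MSP normalized per sample)
    is an assumption, not taken from the paper.
    """
    p_vlm = softmax(vlm_logits)
    p_tgt = softmax(target_logits)
    # Maximum Softmax Probability: confidence of each model per sample.
    msp_vlm = p_vlm.max(axis=-1, keepdims=True)
    msp_tgt = p_tgt.max(axis=-1, keepdims=True)
    # The more confident model gets the larger share of the blend.
    w_vlm = msp_vlm / (msp_vlm + msp_tgt)
    return w_vlm * p_vlm + (1.0 - w_vlm) * p_tgt


def distill_loss(teacher_probs, student_logits):
    # Cross-entropy between the blended teacher and the student's
    # prediction, averaged over the batch (assumed loss form).
    log_p = np.log(softmax(student_logits) + 1e-12)
    return float(-(teacher_probs * log_p).sum(axis=-1).mean())


# Sample 0: the VLM is confident; sample 1: the target model is confident.
vlm_logits = np.array([[4.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
target_logits = np.array([[0.0, 0.1, 0.0], [0.0, 5.0, 0.0]])
teacher = msp_blend(vlm_logits, target_logits)
loss = distill_loss(teacher, target_logits)
```

In this toy batch the blended teacher follows whichever model is more confident on each sample, so its predicted classes are 0 and 1 respectively. In an actual adaptation loop the loss would be backpropagated through the target model only, since the VLM is frozen.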

Xiao Chen, Jiazhen Huang, Zhiming Liu, Qinting Jiang, Fanding Huang, Jingyan Jiang, Zhi Wang • 2025

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet-C Severity 5 (test)	Mean Error Rate (Severity 5): 43.97	216
Image Classification	CIFAR-10C Severity Level 5 (test)	--	136
Online Continual Test-Time Adaptation	ImageNet-C Severity 5 (test)	Accuracy (Gaussian Noise, ImageNet-C S5): 27.57	47
Image Classification	ImageNet-C Severity 5	Error Rate (Gaussian): 54.43	43
Image Classification	CIFAR-100-C Severity 5	--	26
Online Continual Test-Time Adaptation	CIFAR-10-C severity 5 (test)	Gaussian Noise Accuracy (Severity 5): 79.26	24
Test-time adaptation	Office-Home	Accuracy: 80.38	16
Image Classification	PACS Art	Accuracy (Cartoon Domain): 99.73	14
Continual Test-Time Adaptation	ImageNet-C long-term continual adaptation	Average Accuracy: 60.69	10
Continual Test-Time Adaptation	CIFAR-10-C	Average Accuracy: 87.25	10

