Progressive Conditioned Scale-Shift Recalibration of Self-Attention for Online Test-time Adaptation
About
Online test-time adaptation aims to dynamically adjust a network model in real time based on sequential input samples during the inference stage. In this work, we find that, when a transformer model is applied to a new target domain, the Query, Key, and Value features of its self-attention modules often change significantly from those in the source domain, leading to substantial performance degradation. To address this issue, we propose a new approach that progressively recalibrates the self-attention at each layer using a local linear transform parameterized by conditioned scale and shift factors. We treat online model adaptation from the source domain to the target domain as a progressive domain shift separation process. At each transformer layer, a Domain Separation Network extracts a domain shift feature, which a Factor Generator Network uses to predict the scale and shift parameters for self-attention recalibration. These two lightweight networks are adapted online during inference. Experimental results on benchmark datasets demonstrate that the proposed progressive conditioned scale-shift recalibration (PCSR) method improves online test-time domain adaptation performance by up to 3.9% in classification accuracy on the ImageNet-C dataset.
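The per-layer recalibration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the network shapes, the mean-pooled domain feature, and the identity-initialized residual scale are all assumptions made for clarity, and in the actual method both lightweight networks are learned and updated online at each transformer layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron with ReLU; stands in for both lightweight networks.
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

d = 8  # illustrative feature dimension

# Hypothetical Domain Separation Network: layer input -> domain shift feature.
w1 = rng.normal(scale=0.1, size=(d, d)); b1 = np.zeros(d)
w2 = rng.normal(scale=0.1, size=(d, d)); b2 = np.zeros(d)

# Hypothetical Factor Generator Network: shift feature -> per-channel scale and shift.
g1 = rng.normal(scale=0.1, size=(d, d)); c1 = np.zeros(d)
g2 = rng.normal(scale=0.1, size=(d, 2 * d)); c2 = np.zeros(2 * d)

def recalibrate(feat):
    """Apply a conditioned scale-shift transform to a Q/K/V feature map (tokens x d)."""
    # Summarize the layer input (here: mean over tokens) and extract the shift feature.
    shift_feat = mlp(feat.mean(axis=0, keepdims=True), w1, b1, w2, b2)
    factors = mlp(shift_feat, g1, c1, g2, c2)
    scale, shift = factors[:, :d], factors[:, d:]
    # Local linear transform: identity scale plus a predicted residual correction.
    return (1.0 + scale) * feat + shift

tokens = rng.normal(size=(4, d))  # toy token features for one layer
out = recalibrate(tokens)
print(out.shape)  # (4, 8)
```

In practice the same recalibration would be applied separately to the Query, Key, and Value features at every layer, with the two small networks updated from the unlabeled test stream.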
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet-R | Accuracy | 66.5 | 148 |
| Image Classification | ImageNet-C level 5 | Avg Top-1 Acc (ImageNet-C L5) | 70.4 | 61 |
| Image Classification | ImageNet-A | Accuracy | 52.1 | 50 |
| Image Classification | ImageNet-C Severity 5 (test) | Error Rate (Gaussian) | 59.4 | 42 |
| Image Classification | ImageNet-C level 3 (test) | Acc (Brightness) | 81.9 | 34 |
| Image Classification | Visda 2021 | Accuracy | 64.8 | 14 |