EmbodiTTA: Resource-Efficient Test-Time Adaptation for Embodied Visual Systems
About
Continual test-time adaptation (CTTA) adapts the deployed model on every incoming batch of data. Although this yields high accuracy, existing CTTA approaches are poorly suited to resource-constrained edge devices because of their substantial memory overhead and energy consumption. In this work, we first introduce a novel paradigm -- on-demand TTA -- which triggers adaptation only when a significant domain shift is detected. We then present OD-TTA, an on-demand TTA framework for accurate and efficient adaptation on edge devices. OD-TTA comprises three techniques: 1) a lightweight domain shift detection mechanism that activates TTA only when it is needed, drastically reducing the overall computation overhead; 2) a source domain selection module that chooses an appropriate source model for adaptation, ensuring high and robust accuracy; and 3) a decoupled Batch Normalization (BN) update scheme that enables memory-efficient adaptation with small batch sizes. Extensive experiments show that OD-TTA achieves comparable or even better performance while substantially reducing energy and computation overhead, making TTA practical on edge devices.
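To make the on-demand idea concrete, the following is a minimal sketch of a trigger loop that adapts only when a cheap statistic drifts. It is not the OD-TTA detection mechanism, which the abstract does not specify. The choice of statistic (an exponential moving average of prediction entropy), the names `ShiftDetector` and `run_stream`, and the threshold and momentum values are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


def entropy(probs):
    """Shannon entropy (in nats) of one softmax output."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


@dataclass
class ShiftDetector:
    """Hypothetical detector: fires when the smoothed batch entropy drifts
    from the reference level by more than `threshold`."""
    threshold: float = 0.3   # assumed value; would need tuning per model
    momentum: float = 0.8    # EMA smoothing factor; also an assumption
    reference: Optional[float] = None  # entropy level of the current domain
    ema: Optional[float] = None        # smoothed entropy of recent batches

    def update(self, batch_probs) -> bool:
        batch_entropy = sum(entropy(p) for p in batch_probs) / len(batch_probs)
        if self.reference is None:
            # First batch after start-up or after an adaptation: re-anchor.
            self.reference = self.ema = batch_entropy
            return False
        self.ema = self.momentum * self.ema + (1 - self.momentum) * batch_entropy
        if abs(self.ema - self.reference) > self.threshold:
            # Forget the old domain; the next batch becomes the new reference.
            self.reference = self.ema = None
            return True
        return False


def run_stream(batches, detector, adapt):
    """Run the (cheap) detector on every batch; call `adapt` only when it fires."""
    triggered = []
    for index, batch_probs in enumerate(batches):
        if detector.update(batch_probs):
            adapt(index)  # placeholder for the expensive adaptation step
            triggered.append(index)
    return triggered


if __name__ == "__main__":
    confident = [[0.98, 0.01, 0.01]] * 4   # low-entropy predictions
    uncertain = [[1 / 3, 1 / 3, 1 / 3]] * 4  # high-entropy predictions
    stream = [confident] * 3 + [uncertain] * 3
    print(run_stream(stream, ShiftDetector(), adapt=lambda i: None))
```

In this sketch the per-batch cost is a single entropy average, and the expensive `adapt` callback runs only on the batches where the detector fires. A continual scheme would instead call it on every batch.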
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet-C | Accuracy | 33.3 | 117 |
| Image Classification | CIFAR10-C | Inference Latency | 47 | 42 |
| Object Recognition | CORe50 indoor-to-outdoor sessions | Accuracy | 84.5 | 24 |
| Object Recognition | CIFAR10-C | Average Accuracy | 84.9 | 24 |
| Object Recognition | ImageNet-C | Average Accuracy | 40.4 | 24 |
| Image Classification | ImageNet-C | Memory (MB) | 624 | 18 |
| Semantic Segmentation | SHIFT | Accuracy (Day->Night) | 31.1 | 6 |