Order-Aware Test-Time Adaptation: Leveraging Temporal Dynamics for Robust Streaming Inference
About
Test-Time Adaptation (TTA) enables pre-trained models to adjust to distribution shift by learning from unlabeled test-time streams. However, existing methods typically treat these streams as independent samples, overlooking the supervisory signal inherent in temporal dynamics. To address this, we introduce Order-Aware Test-Time Adaptation (OATTA). We formulate test-time adaptation as a gradient-free recursive Bayesian estimation task, using a learned dynamic transition matrix as a temporal prior to refine the base model's predictions. To ensure safety in weakly structured streams, we introduce a likelihood-ratio gate (LLR) that reverts to the base predictor when temporal evidence is absent. OATTA is a lightweight, model-agnostic module that incurs negligible computational overhead. Extensive experiments across image classification, wearable and physiological signal analysis, and language sentiment analysis demonstrate its universality; OATTA consistently boosts established baselines, improving accuracy by up to 6.35%. Our findings establish that modeling temporal dynamics provides a critical, orthogonal signal beyond standard order-agnostic TTA approaches.
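The recursive Bayesian update and the likelihood-ratio gate described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: the class name `OrderAwareFilter`, the soft-count estimate of the transition matrix, the decayed log-likelihood ratio against a running class marginal, and all parameter names and defaults are hypothetical, not the authors' implementation. The sketch treats the base model's softmax output as a per-class likelihood and applies an HMM-style forward step with no gradient updates.

```python
import numpy as np


class OrderAwareFilter:
    """Hypothetical gradient-free temporal filter over base-model class probabilities."""

    def __init__(self, num_classes, smoothing=1.0, gate_threshold=0.0, decay=0.9):
        k = num_classes
        # Soft transition counts; row i holds evidence for transitions i -> j (assumed estimator).
        self.trans_counts = np.full((k, k), float(smoothing))
        # Soft class counts for the order-agnostic reference model (assumed).
        self.class_counts = np.full(k, float(smoothing))
        self.belief = np.full(k, 1.0 / k)  # posterior from the previous step
        self.llr = 0.0                     # exponentially decayed log-likelihood ratio
        self.gate_threshold = gate_threshold
        self.decay = decay

    def step(self, probs):
        """Consume one base-model probability vector; return (prediction, gate_open)."""
        probs = np.asarray(probs, dtype=float)
        transition = self.trans_counts / self.trans_counts.sum(axis=1, keepdims=True)
        marginal = self.class_counts / self.class_counts.sum()

        # Temporal prior: push the previous belief through the transition matrix.
        prior = self.belief @ transition

        # Evidence for the current prediction under each model (floored for log safety).
        ev_temporal = max(float(probs @ prior), 1e-12)
        ev_static = max(float(probs @ marginal), 1e-12)

        # Bayes update: posterior is proportional to likelihood times temporal prior.
        posterior = probs * prior / ev_temporal

        # Gate: trust the temporal posterior only while the order-aware model
        # explains the recent stream better than the order-agnostic marginal.
        self.llr = self.decay * self.llr + np.log(ev_temporal) - np.log(ev_static)
        gate_open = bool(self.llr > self.gate_threshold)
        output = posterior if gate_open else probs

        # Gradient-free update of the transition and marginal statistics.
        self.trans_counts += np.outer(self.belief, posterior)
        self.class_counts += posterior
        self.belief = posterior
        return output, gate_open
```

With a threshold of zero, the sketch falls back to the base prediction whenever the stream order carries no more evidence than the class marginal, which mirrors the abstract's description of reverting to the base predictor when temporal evidence is absent.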
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Human Activity Recognition | USC-HAD (held-out) | Accuracy | 58.5 | 20 |
| Image Classification | Caltech Camera Traps (CCT) (test) | Accuracy | 61.84 | 14 |
| Sentiment Classification | Sentiment140 (test) | Accuracy | 87.35 | 12 |
| Human Activity Recognition | USC-HAD source-target pairs | Transfer Accuracy (S01 → S03) | 61.03 | 12 |
| Wearable Human Activity Recognition | UCI-HAR S05 → S29 (cross-subject test-time adaptation) | MF1 Score | 72.87 | 12 |
| Wearable Human Activity Recognition | UCI-HAR S07 → S27 cross-subject test-time adaptation | MF1 Score | 76.51 | 12 |
| Wearable Human Activity Recognition | UCI-HAR S08 → S21 cross-subject test-time adaptation | MF1 Score | 78.78 | 12 |
| Wearable Human Activity Recognition | UCI-HAR S08 → S23 cross-subject test-time adaptation | MF1 Score | 76.29 | 12 |
| Wearable Human Activity Recognition | UCI-HAR S05 → S18 cross-subject test-time adaptation | MF1 Score | 53.38 | 12 |
| Wearable Human Activity Recognition | UCI-HAR S07 → S19 cross-subject test-time adaptation | MF1 Score | 48.58 | 12 |