
Video Test-Time Adaptation for Action Recognition

About

Although action recognition systems can achieve top performance when evaluated on in-distribution test points, they are vulnerable to unanticipated distribution shifts in test data. However, test-time adaptation of video action recognition models against common distribution shifts has so far not been demonstrated. We propose to address this problem with an approach tailored to spatio-temporal models that is capable of adapting on a single video sample per step. It consists of a feature distribution alignment technique that aligns online estimates of test set statistics towards the training statistics. We further enforce prediction consistency over temporally augmented views of the same test video sample. Evaluations on three benchmark action recognition datasets show that our proposed technique is architecture-agnostic and able to significantly boost the performance of both the state-of-the-art convolutional architecture TANet and the Video Swin Transformer. Our proposed method demonstrates a substantial performance gain over existing test-time adaptation approaches, both when evaluated on a single distribution shift and in the challenging case of random distribution shifts. Code will be available at https://github.com/wlin-at/ViTTA.
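The two ingredients named in the abstract, aligning online estimates of test statistics with training statistics and enforcing consistency across temporally augmented views, can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that). The exponential moving average, the L1 distance, the momentum value, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
def channel_stats(features):
    """Per-channel mean and (biased) variance over a list of feature vectors."""
    n = len(features)
    dims = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dims)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dims)]
    return mean, var


def ema_update(running, new_value, momentum=0.1):
    """Online estimate: blend the running statistic with the newest one.

    The EMA form and the momentum of 0.1 are assumptions, not taken from the paper.
    """
    return [(1 - momentum) * r + momentum * v for r, v in zip(running, new_value)]


def alignment_loss(test_mean, test_var, train_mean, train_var):
    """Assumed L1 distance between online test statistics and training statistics."""
    mean_gap = sum(abs(a - b) for a, b in zip(test_mean, train_mean))
    var_gap = sum(abs(a - b) for a, b in zip(test_var, train_var))
    return mean_gap + var_gap


def consistency_loss(probs_view_a, probs_view_b):
    """Mean absolute difference between class probabilities of two temporal views."""
    diffs = [abs(a - b) for a, b in zip(probs_view_a, probs_view_b)]
    return sum(diffs) / len(diffs)


# Hypothetical training statistics for a 2-channel feature layer.
train_mean, train_var = [0.0, 0.0], [1.0, 1.0]
running_mean, running_var = list(train_mean), list(train_var)

# Features extracted from a single (shifted) test video; values are made up.
video_features = [[1.0, 2.0], [3.0, 2.0]]
batch_mean, batch_var = channel_stats(video_features)
running_mean = ema_update(running_mean, batch_mean)
running_var = ema_update(running_var, batch_var)

align = alignment_loss(running_mean, running_var, train_mean, train_var)
consist = consistency_loss([0.7, 0.3], [0.6, 0.4])
total = align + consist  # equal weighting is an arbitrary choice here
```

In a real model the combined loss would be backpropagated through the network after each test video. The sketch only computes the loss values, to show how the two terms are formed.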

Wei Lin, Muhammad Jehanzeb Mirza, Mateusz Kozinski, Horst Possegger, Hilde Kuehne, Horst Bischof • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Action Recognition | Something-Something v2 (val) | Top-1 Accuracy | 49.66 | 535 |
| Action Recognition | UCF101 (val) | Accuracy | 84.74 | 42 |
| Video Action Classification | UCF101 time-correlated (val) | Mean Top-1 Acc | 83.92 | 21 |
| Video Action Classification | SSv2 time-correlated (val) | Top-1 Accuracy | 48.25 | 21 |
| Video Action Classification | K400 time-correlated (val) | Top-1 Accuracy | 53.97 | 21 |
| Video Classification | Kinetics-400 | Average Accuracy (Corruption Robustness) | 48.94 | 15 |
| Action Recognition | UCF101 random distribution shifts (test) | Top-1 Acc | 83.11 | 12 |
| Action Recognition | SSv2 random distribution shifts (test) | Top-1 Accuracy | 46.32 | 12 |
| Action Recognition | K400 random distribution shifts (test) | Mean Top-1 Accuracy | 49.67 | 12 |
| Action Recognition | UCF101 | Acc (Gauss) | 71.37 | 7 |
Showing 10 of 14 rows
