
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

About

Fine-grained action recognition datasets exhibit environmental bias, where multiple video sequences are captured from a limited number of environments. Training a model in one environment and deploying in another results in a drop in performance due to an unavoidable domain shift. Unsupervised Domain Adaptation (UDA) approaches have frequently utilised adversarial training between the source and target domains. However, these approaches have not explored the multi-modal nature of video within each domain. In this work we exploit the correspondence of modalities as a self-supervised alignment approach for UDA in addition to adversarial alignment. We test our approach on three kitchens from our large-scale dataset, EPIC-Kitchens, using two modalities commonly employed for action recognition: RGB and Optical Flow. We show that multi-modal self-supervision alone improves the performance over source-only training by 2.4% on average. We then combine adversarial training with multi-modal self-supervision, showing that our approach outperforms other UDA methods by 3%.
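The modality-correspondence idea in the abstract can be illustrated with a small, hypothetical pair-building routine: an RGB/Flow pair taken from the same clip is labelled as corresponding, and a pair mixing two different clips is labelled as mismatched. This is only a minimal sketch under stated assumptions. The function name `make_correspondence_pairs`, the one-positive-one-negative sampling scheme, and the use of plain lists as stand-ins for clip features are illustrative choices, not details taken from the paper.

```python
import random


def make_correspondence_pairs(rgb_clips, flow_clips, rng=None):
    """Build (rgb, flow, label) triples for a modality-correspondence task.

    label 1: RGB and Flow come from the same clip (corresponding).
    label 0: Flow is taken from a different clip (mismatched).

    Assumption: one positive and one negative pair per clip. The source
    text does not specify the sampling scheme.
    """
    if len(rgb_clips) != len(flow_clips):
        raise ValueError("expected one RGB and one Flow entry per clip")
    n = len(rgb_clips)
    if n < 2:
        raise ValueError("need at least two clips to form a mismatched pair")
    rng = rng or random.Random(0)

    pairs = []
    for i in range(n):
        # Positive pair: both modalities from clip i.
        pairs.append((rgb_clips[i], flow_clips[i], 1))
        # Negative pair: draw an index j uniformly from all clips except i.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        pairs.append((rgb_clips[i], flow_clips[j], 0))
    return pairs
```

Because the labels come from the data itself rather than from action annotations, pairs like these could in principle be built from unlabelled target-domain clips as well as source clips, which is what makes such a task usable in an unsupervised domain adaptation setting.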

Jonathan Munro, Dima Damen • 2020

Related benchmarks

Task | Dataset | Result | Rank
Unsupervised Domain Adaptation | UCF-HMDB | Accuracy (U -> H): 84.2 | 52
Action Recognition | Epic-Kitchens | Average Comparison Score: 43.9 | 47
Open-set Unsupervised Video Domain Adaptation | Epic-Kitchens | Average Performance: 56.49 | 31
Action Recognition | EPIC-KITCHENS (test) | Average Score: 56.25 | 25
Multi-source Domain Generalization | HAC | Mean Accuracy: 66.74 | 24
Video Moment Retrieval | C → T cross-domain (Source: C, Target: T) (test) | R@1 (IoU=0.3): 31.5 | 7
Video Unsupervised Domain Adaptation | ActorShift | Transfer Score KT to C111 | 7
Video Moment Retrieval | A → C cross-domain (target domain evaluation) | R@1 (IoU=0.3): 67.56 | 7
Video Moment Retrieval | Cross-domain C → A (target evaluation) | R@1 (IoU=0.3): 67.3 | 7
Video Moment Retrieval | T → C cross-domain (Source: T, Target: C) (target domain evaluation) | R@1 (IoU=0.3): 44.81 | 7
Showing 10 of 13 rows
