
Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition

About

Humans can easily recognize actions from only a few examples, while existing video recognition models still rely heavily on large-scale labeled data. This observation has motivated growing interest in few-shot video action recognition, which aims to learn new actions from only a few labeled samples. In this paper, we propose a depth guided Adaptive Meta-Fusion Network for few-shot video recognition, termed AMeFu-Net. Concretely, we tackle the few-shot recognition problem from three aspects: first, we alleviate the extreme data scarcity by introducing depth information as a carrier of the scene, which brings extra visual information to our model; second, we fuse the representation of the original RGB clips with multiple non-strictly corresponding depth clips sampled by our temporal asynchronization augmentation mechanism, which synthesizes new instances at the feature level; third, we propose a novel Depth Guided Adaptive Instance Normalization (DGAdaIN) fusion module to fuse the two-stream modalities efficiently. Additionally, to better mimic the few-shot recognition process, our model is trained in a meta-learning manner. Extensive experiments on several action recognition benchmarks demonstrate the effectiveness of our model.
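The fusion idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the standard adaptive instance normalization form, in which the RGB feature is normalized and then rescaled and shifted by parameters predicted from a depth feature, and it assumes that "non-strictly corresponding" depth clips can be modeled as clips taken at small random temporal offsets from the RGB clip. All function names, shapes, and the linear predictors (`w_scale`, `w_shift`) are hypothetical.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Normalize each feature vector (last axis) to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def depth_guided_adain(rgb_feat, depth_feat, w_scale, b_scale, w_shift, b_shift):
    """AdaIN-style fusion (assumed form): depth features predict the scale
    and shift that are applied to the normalized RGB features."""
    scale = depth_feat @ w_scale + b_scale   # shape (num_depth_clips, dim)
    shift = depth_feat @ w_shift + b_shift   # shape (num_depth_clips, dim)
    return scale * instance_norm(rgb_feat) + shift

def sample_async_depth_clips(rgb_index, num_clips, max_offset, num_depth, rng):
    """Hypothetical sampler: choose depth clip indices near the RGB clip
    index, allowing small temporal misalignment."""
    offsets = rng.integers(-max_offset, max_offset + 1, size=num_clips)
    return np.clip(rgb_index + offsets, 0, num_depth - 1)

rng = np.random.default_rng(0)
dim = 8
rgb = rng.normal(size=(1, dim))          # feature of one RGB clip
depth_bank = rng.normal(size=(6, dim))   # features of 6 depth clips from the same video

# Pair the RGB clip with 3 depth clips that are at most 1 clip away in time.
idx = sample_async_depth_clips(rgb_index=2, num_clips=3, max_offset=1,
                               num_depth=6, rng=rng)

# Randomly initialized linear predictors stand in for learned layers.
w_scale = rng.normal(size=(dim, dim)) * 0.1
w_shift = rng.normal(size=(dim, dim)) * 0.1
fused = depth_guided_adain(rgb, depth_bank[idx],
                           w_scale, np.ones(dim), w_shift, np.zeros(dim))
print(fused.shape)  # (3, 8): one fused feature per sampled depth clip
```

Each sampled depth clip yields a different fused feature for the same RGB clip, which is one way to read the abstract's claim that the augmentation synthesizes new instances at the feature level.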

Yuqian Fu, Li Zhang, Junke Wang, Yanwei Fu, Yu-Gang Jiang • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Action Recognition | Kinetics | Accuracy (5-shot) | 86.8 | 98 |
| Video Recognition | HMDB51 | Accuracy | 75.5 | 89 |
| Action Recognition | UCF101 | 5-shot Accuracy | 95.5 | 48 |
| Video Recognition | Kinetics (test) | Accuracy | 86.8 | 42 |
| Video Action Recognition | UCF101 5-way 5-shot | Accuracy | 95.5 | 28 |
| Video Action Recognition | HMDB51 5-way 5-shot | Accuracy | 75.5 | 28 |
| Few-shot Action Recognition | Kinetics 5-shot | Accuracy | 86.8 | 27 |
| Few-shot Action Recognition | UCF101 5-shot | Accuracy | 95.5 | 27 |
| Few-shot Action Recognition | HMDB51 5-shot | Accuracy | 75.5 | 27 |
| Action Recognition | HMDB51 | 5-shot Accuracy | 75.5 | 25 |
Showing 10 of 18 rows
