
Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching

About

Class prototype construction and matching are core aspects of few-shot action recognition. Previous methods mainly focus on designing spatiotemporal relation modeling modules or complex temporal alignment algorithms. Despite promising results, they ignore the value of class prototype construction and matching, leading to unsatisfactory performance when recognizing similar categories within a task. In this paper, we propose GgHM, a new framework with Graph-guided Hybrid Matching. Concretely, we learn task-oriented features under the guidance of a graph neural network during class prototype construction, explicitly optimizing the intra- and inter-class feature correlations. Next, we design a hybrid matching strategy that combines frame-level and tuple-level matching to classify videos with diverse temporal styles. We additionally propose a learnable dense temporal modeling module that enhances the temporal representation of video features, building a more solid foundation for the matching process. GgHM shows consistent improvements over other challenging baselines on several few-shot datasets, demonstrating the effectiveness of our method. The code will be publicly available at https://github.com/jiazheng-xing/GgHM.
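The hybrid matching idea (frame-level plus tuple-level similarity between a query video and a class prototype) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the max-over-prototype-frames aggregation, and the fixed mixing weight `alpha` are all assumptions for exposition.

```python
import torch
import torch.nn.functional as F

def frame_level_score(query, prototype):
    """Frame-level matching: best cosine match per query frame.

    query, prototype: [T, D] per-frame features of one video / prototype.
    """
    q = F.normalize(query, dim=-1)
    p = F.normalize(prototype, dim=-1)
    sim = q @ p.T                          # [T, T] frame-to-frame similarities
    # For each query frame, take its best-matching prototype frame, then average.
    return sim.max(dim=1).values.mean()

def tuple_level_score(query, prototype):
    """Tuple-level matching: compare ordered frame pairs to capture temporal order."""
    T, D = query.shape
    idx = torch.combinations(torch.arange(T), r=2)     # [P, 2] ordered frame-index pairs
    q_tuples = query[idx].reshape(len(idx), 2 * D)     # concatenate each pair's features
    p_tuples = prototype[idx].reshape(len(idx), 2 * D)
    q = F.normalize(q_tuples, dim=-1)
    p = F.normalize(p_tuples, dim=-1)
    sim = q @ p.T                          # [P, P] tuple-to-tuple similarities
    return sim.max(dim=1).values.mean()

def hybrid_score(query, prototype, alpha=0.5):
    """Weighted combination of the two matching granularities (alpha is illustrative)."""
    return alpha * frame_level_score(query, prototype) + \
           (1 - alpha) * tuple_level_score(query, prototype)
```

At episode time, such a score would be computed between the query video and each class prototype, with the query assigned to the highest-scoring class.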

Jiazheng Xing, Mengmeng Wang, Yudi Ruan, Bofan Chen, Yaowei Guo, Boyu Mu, Guang Dai, Jingdong Wang, Yong Liu • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Action Recognition | Kinetics | Accuracy (5-shot) | 87.4 | 47 |
| Video Action Recognition | UCF101 5-way 5-shot | Accuracy | 96.3 | 28 |
| Video Action Recognition | HMDB51 5-way 5-shot | Accuracy | 76.9 | 28 |
| Few-shot Action Recognition | HMDB | Accuracy | 61.2 | 21 |
| Few-shot Action Recognition | UCF101 5-way 1-shot | Accuracy | 85.2 | 21 |
| Action Recognition | SS Full v2 | 1-shot Accuracy | 54.5 | 21 |
| Action Recognition | HMDB51 | Accuracy (1-shot) | 61.2 | 16 |
| Action Recognition | UCF101 | 1-shot Accuracy | 85.2 | 16 |
