
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

About

Recently proposed neural network-based Temporal Action Detection (TAD) models are inherently limited in their ability to extract discriminative representations and to model action instances of various lengths from complex scenes with shared-weight detection heads. Inspired by the success of dynamic neural networks, in this paper we build a novel dynamic feature aggregation (DFA) module that can simultaneously adapt kernel weights and receptive fields at different timestamps. Based on DFA, the proposed dynamic encoder layer aggregates the temporal features within the action time ranges and guarantees the discriminability of the extracted representations. Moreover, DFA is used to build a dynamic TAD head (DyHead), which adaptively aggregates multi-scale features with adjusted parameters and learned receptive fields to better detect action instances of diverse durations in videos. With the proposed encoder layer and DyHead, a new dynamic TAD model, DyFADet, achieves promising performance on a series of challenging TAD benchmarks, including HACS-Segment, THUMOS14, ActivityNet-1.3, Epic-Kitchen 100, Ego4D-Moment Queries V1.0, and FineAction. Code is released at https://github.com/yangle15/DyFADet-pytorch.
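
To make the idea of adapting kernel weights and receptive fields per timestamp more concrete, here is a toy sketch in plain Python. It is not the authors' implementation: the function name `dynamic_aggregate`, the single-channel input, the per-tap gate parameters `gate_w` and `gate_b`, and the hard 0.5 threshold are all assumptions chosen for illustration. At each timestamp, a gate computed from the current feature switches individual kernel taps on or off, so the effective kernel and its temporal extent depend on the input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dynamic_aggregate(x, kernel, gate_w, gate_b):
    """Toy input-dependent temporal aggregation over a 1-D sequence.

    x       : list of T floats (single-channel features, an assumption)
    kernel  : list of K static tap weights, K odd
    gate_w  : list of K gate weights (hypothetical parameters)
    gate_b  : list of K gate biases (hypothetical parameters)
    Returns (outputs, masks), where masks[t][j] is 1.0 if tap j is
    active at timestamp t and 0.0 otherwise.
    """
    k = len(kernel)
    half = k // 2
    outputs, masks = [], []
    for t in range(len(x)):
        # The mask is predicted from the feature at the current timestamp.
        mask = [
            1.0 if sigmoid(gate_w[j] * x[t] + gate_b[j]) > 0.5 else 0.0
            for j in range(k)
        ]
        total = 0.0
        for j in range(k):
            src = t + j - half
            # Positions outside the sequence are treated as zero padding.
            if 0 <= src < len(x):
                total += mask[j] * kernel[j] * x[src]
        outputs.append(total)
        masks.append(mask)
    return outputs, masks


if __name__ == "__main__":
    out, masks = dynamic_aggregate(
        x=[1.0, -1.0, 2.0],
        kernel=[1.0, 1.0, 1.0],
        gate_w=[1.0, 0.0, 1.0],  # side taps open only for positive inputs
        gate_b=[0.0, 1.0, 0.0],  # center tap is always open
    )
    print(out)
    print([int(sum(m)) for m in masks])  # active taps per timestamp
```

In this sketch the side taps are masked out wherever the input is negative, so the receptive field shrinks from three timestamps to one at those positions while the remaining weights are unchanged.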

Le Yang, Ziwei Zheng, Yizeng Han, Hao Cheng, Shiji Song, Gao Huang, Fan Li • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Temporal Action Detection | THUMOS-14 (test) | mAP@tIoU=0.5 | 72.7 | 330 |
| Temporal Action Localization | THUMOS14 (test) | AP @ IoU=0.5 | 76.3 | 319 |
| Temporal Action Localization | ActivityNet 1.3 (val) | AP@0.5 | 58.1 | 257 |
| Temporal Action Detection | ActivityNet 1.3 | mAP@0.5 | 58.1 | 93 |
| Temporal Action Detection | ActivityNet 1.3 (test) | Average mAP | 38.5 | 80 |
| Temporal Action Detection | HACS segment (test) | mAP@0.5 | 64 | 30 |
| Temporal Action Detection | FineAction | Avg mAP | 23.8 | 27 |
| Temporal Action Detection | Ego4D MQ 1.0 (test) | AP @ IoU 0.1 | 28.8 | 8 |
| Temporal Action Detection | Epic-Kitchen Verb 100 | mAP @ IoU=0.1 | 28 | 4 |
| Temporal Action Detection | Epic-Kitchen Noun 100 | AP @ IoU=0.1 | 26.8 | 4 |
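
The metrics above are thresholded on temporal IoU (tIoU), the overlap between a predicted segment and a ground-truth segment divided by their combined extent. The sketch below shows that computation for segments given as `(start, end)` pairs in seconds. The function names `temporal_iou` and `is_true_positive` are hypothetical, and a full mAP computation additionally ranks predictions by confidence and matches each ground-truth instance at most once, which is omitted here.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments."""
    # Overlap is clamped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """A prediction counts as a match when its tIoU meets the threshold."""
    return temporal_iou(pred, gt) >= threshold


if __name__ == "__main__":
    pred, gt = (2.0, 6.0), (4.0, 8.0)
    print(temporal_iou(pred, gt))           # overlap 2 s over a 6 s union
    print(is_true_positive(pred, gt, 0.5))
    print(is_true_positive(pred, gt, 0.1))
```

The same prediction can fail at a tIoU threshold of 0.5 and pass at 0.1, which is why results reported at different thresholds are not directly comparable.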
