Enriching Local and Global Contexts for Temporal Action Localization

About

Effectively tackling the problem of temporal action localization (TAL) necessitates a visual representation that jointly pursues two conflicting goals: fine-grained discrimination for temporal localization and sufficient visual invariance for action classification. We address this challenge by enriching both the local and global contexts in the popular two-stage temporal localization framework, where action proposals are first generated and then passed to action classification and temporal boundary regression. Our proposed model, dubbed ContextLoc, can be divided into three sub-networks: L-Net, G-Net and P-Net. L-Net enriches the local context via fine-grained modeling of snippet-level features, which is formulated as a query-and-retrieval process. G-Net enriches the global context via higher-level modeling of the video-level representation. In addition, we introduce a novel context adaptation module to adapt the global context to different proposals. P-Net further models the context-aware inter-proposal relations. We explore two existing models as the P-Net in our experiments. The efficacy of our proposed method is validated by experimental results on the THUMOS14 (54.3% at tIoU@0.5) and ActivityNet v1.3 (56.01% at tIoU@0.5) datasets, where it outperforms recent state-of-the-art methods. Code is available at https://github.com/buxiangzhiren/ContextLoc.
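The abstract describes L-Net as a query-and-retrieval process over snippet-level features but gives no formula. One plausible reading, shown below purely as an illustration, is softmax attention: a proposal-level feature acts as the query, and the retrieved local context is a similarity-weighted average of the snippet features. The function name, the dot-product similarity, and the scaling by the square root of the feature dimension are all assumptions for this sketch, not details taken from the paper or its repository.

```python
import math


def retrieve_local_context(query, snippets):
    """Hypothetical sketch of a query-and-retrieval step (not the authors' code).

    query    -- one feature vector (list of floats), e.g. a proposal-level feature
    snippets -- list of snippet feature vectors with the same dimension
    Returns (weights, context): softmax attention weights over the snippets
    and the weighted average of the snippet features.
    """
    dim = len(query)
    # Scaled dot-product similarity between the query and every snippet.
    scores = [
        sum(q * s for q, s in zip(query, snippet)) / math.sqrt(dim)
        for snippet in snippets
    ]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Retrieval: average the snippet features using the attention weights.
    context = [
        sum(w * snippet[i] for w, snippet in zip(weights, snippets))
        for i in range(dim)
    ]
    return weights, context


# A query aligned with the first snippet puts most of the weight on it.
weights, context = retrieve_local_context([1.0, 0.0], [[10.0, 0.0], [0.0, 10.0]])
```

In this reading, snippets that resemble the proposal contribute most to its enriched feature, which is one way fine-grained snippet detail could be folded into a proposal-level representation.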

Zixin Zhu, Wei Tang, Le Wang, Nanning Zheng, Gang Hua • 2021

Related benchmarks

Task                          Dataset                 Metric        Result  Rank
Temporal Action Detection     THUMOS-14 (test)        mAP@tIoU=0.5  54.3    330
Temporal Action Localization  THUMOS14 (test)         AP @ IoU=0.5  54.3    319
Temporal Action Localization  THUMOS-14 (test)        mAP@0.3       68.3    308
Temporal Action Localization  ActivityNet 1.3 (val)   AP@0.5        56      257
Temporal Action Detection     ActivityNet v1.3 (val)  mAP@0.5       56      185
Temporal Action Detection     ActivityNet 1.3         mAP@0.5       56      93
Temporal Action Localization  THUMOS 2014             mAP@0.30      68.3    93
Temporal Action Detection     ActivityNet 1.3 (test)  Average mAP   34.2    80
Temporal Action Detection     THUMOS 14               mAP@0.3       68.3    71
Temporal Action Localization  THUMOS-14 (test)        mAP@0.3       68.3    36
Showing 10 of 11 rows
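
The thresholds in the benchmark metrics (for example mAP@tIoU=0.5) refer to temporal intersection over union between a predicted segment and a ground-truth segment. The sketch below shows the standard definition for two (start, end) intervals in seconds. The function names are hypothetical, and counting a prediction as a match when tIoU is greater than or equal to the threshold is a common convention assumed here, not something this page specifies.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments, in the same time unit."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap length is zero when the segments do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """Assumed convention: a prediction matches when tIoU >= threshold."""
    return temporal_iou(pred, gt) >= threshold


# Prediction 2s-8s against ground truth 4s-10s:
# intersection = 4s, union = 6 + 6 - 4 = 8s, so tIoU = 0.5.
print(temporal_iou((2.0, 8.0), (4.0, 10.0)))  # 0.5
```

Under this convention the example prediction counts as correct at a threshold of 0.5 but not at 0.7. Mean average precision at a given threshold is then computed per action class from such matches and averaged over classes.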
