Temporal Action Proposal Generation with Transformers

About

Transformer networks are effective at modeling long-range contextual information and have recently demonstrated strong performance in natural language processing. Conventionally, the temporal action proposal generation (TAPG) task is divided into two sub-tasks: boundary prediction and proposal confidence prediction, which rely on frame-level dependencies and proposal-level relationships, respectively. To capture dependencies at these different levels of granularity, this paper presents a unified temporal action proposal generation framework using original Transformers, called TAPG Transformer, which consists of a Boundary Transformer and a Proposal Transformer. Specifically, the Boundary Transformer captures long-term temporal dependencies to predict precise boundary information, and the Proposal Transformer learns rich inter-proposal relationships for reliable confidence evaluation. Extensive experiments are conducted on two popular benchmarks, ActivityNet-1.3 and THUMOS14, and the results show that TAPG Transformer outperforms state-of-the-art methods. Combined with an existing action classifier, the method also achieves strong performance on the temporal action localization task. Code and models will be made available.
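To make the two sub-tasks concrete, the following is a minimal sketch of how per-frame boundary probabilities and a proposal-level confidence score might be combined into ranked proposals. It follows a convention common in boundary-based proposal methods and is not confirmed by the abstract as this paper's procedure. The function name `generate_proposals`, the `threshold` parameter, the `confidence` callable, and the product scoring rule are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Proposal = Tuple[int, int, float]  # (start index, end index, score)


def generate_proposals(
    start_prob: Sequence[float],
    end_prob: Sequence[float],
    confidence: Callable[[int, int], float],
    threshold: float = 0.5,
) -> List[Proposal]:
    """Pair likely start and end positions into scored proposals.

    Assumption: `start_prob` and `end_prob` stand in for the output of a
    boundary predictor, and `confidence` stands in for a proposal-level
    scorer. Both are hypothetical placeholders, not the paper's models.
    """
    # Keep temporal positions whose boundary probability passes the threshold.
    starts = [t for t, p in enumerate(start_prob) if p >= threshold]
    ends = [t for t, p in enumerate(end_prob) if p >= threshold]

    proposals: List[Proposal] = []
    for s in starts:
        for e in ends:
            if e > s:  # a proposal must end after it starts
                # Assumed scoring rule: product of boundary and confidence scores.
                score = start_prob[s] * end_prob[e] * confidence(s, e)
                proposals.append((s, e, score))

    # Highest-scoring proposals first.
    proposals.sort(key=lambda p: p[2], reverse=True)
    return proposals


if __name__ == "__main__":
    start_prob = [0.9, 0.1, 0.2, 0.1]
    end_prob = [0.1, 0.2, 0.8, 0.6]
    for s, e, score in generate_proposals(start_prob, end_prob, lambda s, e: 1.0):
        print(s, e, round(score, 2))
```

In this sketch the boundary scores depend only on individual temporal positions, while the confidence term depends on the whole (start, end) pair, which mirrors the frame-level versus proposal-level split described in the abstract.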

Lining Wang, Haosen Yang, Wenhao Wu, Hongxun Yao, Hujie Huang • 2021

Related benchmarks

Task                       Dataset                 Metric        Result  Rank
Temporal Action Detection  THUMOS-14 (test)        mAP@tIoU=0.5  44.6    330
Temporal Action Detection  ActivityNet 1.3 (test)  Average mAP   35.5    80
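The THUMOS-14 metric above is thresholded on temporal intersection-over-union (tIoU). As a minimal sketch, assuming segments are given as (start, end) pairs in seconds, tIoU can be computed as below. Under the usual convention, which this page does not spell out, a detection counts as correct at tIoU=0.5 when its overlap with a ground-truth instance of the same class reaches 0.5. The function name `temporal_iou` is chosen here for illustration.

```python
from typing import Tuple

Segment = Tuple[float, float]  # (start, end), assumed to be in seconds


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Return the temporal IoU between a predicted and a ground-truth segment."""
    # Length of the overlap, clamped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union is the sum of both lengths minus the overlap.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Overlap is 5 s and union is 15 s, so tIoU is 1/3: below a 0.5 threshold.
    print(temporal_iou((10.0, 20.0), (15.0, 25.0)))
```

The "Average mAP" reported for ActivityNet 1.3 is typically the mean of mAP values over several tIoU thresholds rather than a single one; the exact threshold set is not given on this page.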
