
Temporal Context Network for Activity Localization in Videos

About

We present a Temporal Context Network (TCN) for precise temporal localization of human activities. Similar to the Faster-RCNN architecture, proposals are placed at equal intervals in a video which span multiple temporal scales. We propose a novel representation for ranking these proposals. Since pooling features only inside a segment is not sufficient to predict activity boundaries, we construct a representation which explicitly captures context around a proposal for ranking it. For each temporal segment inside a proposal, features are uniformly sampled at a pair of scales and are input to a temporal convolutional neural network for classification. After ranking proposals, non-maximum suppression is applied and classification is performed to obtain final detections. TCN outperforms state-of-the-art methods on the ActivityNet dataset and the THUMOS14 dataset.
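The pipeline described above (multi-scale proposals at equal intervals, context-aware sampling, then non-maximum suppression) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the `stride_ratio`, `num_samples`, and `context_ratio` parameters, and the reading of the "pair of scales" as the proposal span plus a wider window centered on it are all assumptions made for illustration. The ranking network itself is omitted.

```python
import numpy as np


def generate_proposals(num_frames, scales, stride_ratio=0.5):
    """Place windows of each length in `scales` at equal intervals.

    `stride_ratio` is a hypothetical parameter: a window of length L is
    slid with a step of max(1, int(L * stride_ratio)) frames.
    """
    proposals = []
    for length in scales:
        if length > num_frames:
            continue
        step = max(1, int(length * stride_ratio))
        for start in range(0, num_frames - length + 1, step):
            proposals.append((start, start + length))
    return np.array(proposals, dtype=float).reshape(-1, 2)


def sample_pair_of_scales(features, start, end, num_samples=4, context_ratio=2.0):
    """Uniformly sample features at two scales around one proposal.

    Assumption: the first scale is the proposal itself and the second is a
    window `context_ratio` times as long, centered on the proposal.
    Returns an array of shape (2, num_samples, feature_dim).
    """
    num_frames = features.shape[0]
    center = 0.5 * (start + end)
    half = 0.5 * (end - start)

    def uniform_sample(lo, hi):
        # Evenly spaced frame indices, clipped to the video boundaries.
        idx = np.linspace(lo, hi, num_samples)
        idx = np.clip(np.round(idx), 0, num_frames - 1).astype(int)
        return features[idx]

    inner = uniform_sample(start, end - 1)
    outer = uniform_sample(center - context_ratio * half,
                           center + context_ratio * half - 1)
    return np.stack([inner, outer])


def temporal_iou(segment, others):
    """Temporal IoU between one (start, end) segment and an (N, 2) array."""
    inter = np.maximum(
        0.0,
        np.minimum(segment[1], others[:, 1]) - np.maximum(segment[0], others[:, 0]),
    )
    union = (segment[1] - segment[0]) + (others[:, 1] - others[:, 0]) - inter
    return inter / union


def temporal_nms(segments, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring segment, drop heavy overlaps, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = temporal_iou(segments[best], segments[rest])
        order = rest[overlaps <= iou_threshold]
    return keep
```

In this sketch, `generate_proposals(64, [16, 32])` yields windows of 16 and 32 frames slid at half their length, and `temporal_nms` would be applied to those windows once a ranking model has scored them. The temporal IoU computed here is the same overlap measure that thresholds such as tIoU=0.5 in the benchmarks below refer to.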

Xiyang Dai, Bharat Singh, Guyue Zhang, Larry S. Davis, Yan Qiu Chen • 2017

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Temporal Action Detection | THUMOS-14 (test) | mAP@tIoU=0.5: 25.6 | 330 |
| Temporal Action Localization | THUMOS14 (test) | AP @ IoU=0.5: 25.6 | 319 |
| Temporal Action Localization | ActivityNet 1.3 (val) | AP@0.5: 37.49 | 257 |
| Temporal Action Detection | ActivityNet v1.3 (val) | mAP@0.5: 36.2 | 185 |
| Temporal Action Proposal | ActivityNet v1.3 (val) | AUC: 59.58 | 114 |
| Temporal Action Detection | ActivityNet 1.3 (test) | Average mAP: 23.58 | 80 |
| Action Detection | THUMOS 2014 (test) | mAP (alpha=0.5): 25.6 | 79 |
| Temporal Action Detection | THUMOS 14 | mAP@0.3: 33.3 | 71 |
| Temporal Action Proposal Generation | ActivityNet 1.3 (test) | AUC: 61.56 | 62 |
| Action Localization | Thumos14 | mAP@0.5: 25.6 | 34 |
Showing 10 of 13 rows
