
Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge

About

Current state-of-the-art human activity recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame. We propose a simple, yet effective, method for the temporal detection of activities in temporally untrimmed videos with the help of untrimmed classification. Firstly, our model predicts the top k labels for each untrimmed video by analysing global video-level features. Secondly, frame-level binary classification is combined with dynamic programming to generate the temporally trimmed activity proposals. Finally, each proposal is assigned a label based on the global label, and scored with the score of the temporal activity proposal and the global score. Ultimately, we show that untrimmed video classification models can be used as a stepping stone for temporal detection.

Gurkirt Singh, Fabio Cuzzolin • 2016
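
The three steps in the abstract (top-k video labels, frame-level binary classification smoothed by dynamic programming, and proposal scoring) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the energy function or the score combination, so the log-likelihood unary cost, the per-switch penalty `switch_penalty`, the mean frame score as proposal score, and the product of proposal and global scores are all assumptions, as are the function names.

```python
import math


def smooth_binary_labels(frame_scores, switch_penalty=1.0, eps=1e-6):
    """Viterbi-style DP over 0/1 frame labels.

    Assumed energy: sum of -log likelihoods per frame plus
    switch_penalty for every change of label between adjacent frames.
    """
    n = len(frame_scores)
    if n == 0:
        return []

    def unary(p, label):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        return -math.log(p if label == 1 else 1.0 - p)

    cost = [[unary(frame_scores[0], 0), unary(frame_scores[0], 1)]]
    back = []  # back[t-1][label] = best previous label for frame t
    for t in range(1, n):
        row, ptr = [], []
        for label in (0, 1):
            stay = cost[-1][label]
            switch = cost[-1][1 - label] + switch_penalty
            if stay <= switch:
                best, prev = stay, label
            else:
                best, prev = switch, 1 - label
            row.append(best + unary(frame_scores[t], label))
            ptr.append(prev)
        cost.append(row)
        back.append(ptr)

    # Backtrack from the cheaper final state.
    label = 0 if cost[-1][0] <= cost[-1][1] else 1
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    labels.reverse()
    return labels


def extract_proposals(labels, frame_scores):
    """Return (start, end_exclusive, mean_frame_score) for each run of 1s."""
    proposals, start = [], None
    for t, label in enumerate(list(labels) + [0]):  # sentinel closes a trailing run
        if label == 1 and start is None:
            start = t
        elif label == 0 and start is not None:
            segment = frame_scores[start:t]
            proposals.append((start, t, sum(segment) / len(segment)))
            start = None
    return proposals


def detect(frame_scores_by_class, video_scores, k=2, switch_penalty=1.0):
    """Label proposals with the top-k video-level classes.

    Final score = proposal score * global video score (an assumed combination).
    """
    top_k = sorted(video_scores, key=video_scores.get, reverse=True)[:k]
    detections = []
    for name in top_k:
        scores = frame_scores_by_class[name]
        labels = smooth_binary_labels(scores, switch_penalty)
        for start, end, proposal_score in extract_proposals(labels, scores):
            detections.append((name, start, end, proposal_score * video_scores[name]))
    return detections


if __name__ == "__main__":
    # Hypothetical scores for one short video.
    video_scores = {"surfing": 0.7, "running": 0.2}
    frame_scores = {"surfing": [0.1, 0.2, 0.9, 0.4, 0.9, 0.8, 0.1, 0.1]}
    print(detect(frame_scores, video_scores, k=1))
```

With a positive `switch_penalty`, an isolated low-scoring frame inside an activity is absorbed into the surrounding segment rather than splitting it in two. Setting the penalty to zero reduces the labelling to per-frame thresholding at 0.5.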

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Temporal Action Localization | ActivityNet 1.3 (val) | AP@0.5: 34.5 | 257 |
| Temporal Action Detection | ActivityNet v1.3 (val) | mAP@0.5: 34.5 | 185 |
| Temporal Action Detection | ActivityNet 1.3 (test) | Average mAP: 17.83 | 80 |
| Activity Detection | ActivityNet v1.3 (test) | mAP@0.5: 36.4 | 5 |
| Temporal Action Localization | ActivityNet Challenge 2016 (test) | mAP @ IoU=0.5: 36.4 | 5 |
| Temporal Action Localization | ActivityNet Challenge 2016 (val) | mAP (IoU=0.5): 22.7 | 4 |
