Learning Latent Super-Events to Detect Multiple Activities in Videos

About

In this paper, we introduce the concept of learning latent super-events from activity videos, and show how it benefits activity detection in continuous videos. We define a super-event as a set of multiple events occurring together in videos with a particular temporal organization; it is the opposite of the concept of sub-events. Real-world videos contain multiple activities and are rarely segmented (e.g., surveillance videos), and learning latent super-events allows the model to capture how the events are temporally related in videos. We design temporal structure filters that enable the model to focus on particular sub-intervals of the videos, and use them together with a soft attention mechanism to learn representations of latent super-events. Super-event representations are combined with per-frame or per-segment CNNs to provide frame-level annotations. Our approach is designed to be fully differentiable, enabling end-to-end learning of latent super-event representations jointly with the activity detector that uses them. Our experiments on multiple public video datasets confirm that the proposed concept of latent super-event learning significantly benefits activity detection, advancing the state of the art.

AJ Piergiovanni, Michael S. Ryoo • 2017
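
The pipeline described in the abstract (temporal structure filters, soft attention over them, and concatenation with per-frame features) can be illustrated with a minimal sketch. The abstract does not give the form of the filters, so this sketch assumes a Cauchy-shaped kernel with a hypothetical center and width per filter, and it uses fixed attention logits where the real model would learn them end to end. All function and parameter names are illustrative and not taken from the authors' code.

```python
import math


def temporal_filter(num_frames, center, width):
    """Normalized weights over frames (assumed Cauchy-shaped kernel).

    center: relative position in [0, 1]; width: spread in frames (> 0).
    Both parameters are hypothetical stand-ins for learned values.
    """
    c = center * (num_frames - 1)
    raw = [1.0 / (math.pi * width * (1.0 + ((t - c) / width) ** 2))
           for t in range(num_frames)]
    total = sum(raw)
    return [w / total for w in raw]


def softmax(logits):
    """Numerically stable softmax, used here as the soft attention."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def super_event(features, filters, attention_logits):
    """Pool T x D per-frame features into one D-dimensional vector.

    Each (center, width) filter pools the video over a sub-interval;
    soft attention then mixes the pooled vectors.
    """
    num_frames, dim = len(features), len(features[0])
    attention = softmax(attention_logits)
    rep = [0.0] * dim
    for a, (center, width) in zip(attention, filters):
        weights = temporal_filter(num_frames, center, width)
        for t in range(num_frames):
            for d in range(dim):
                rep[d] += a * weights[t] * features[t][d]
    return rep


def frame_inputs(features, rep):
    """Append the super-event vector to every per-frame feature."""
    return [list(f) + list(rep) for f in features]


# Toy video: 4 frames with 2-dimensional features (made-up numbers).
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
rep = super_event(feats, filters=[(0.0, 0.5), (1.0, 0.5)],
                  attention_logits=[0.0, 0.0])
per_frame = frame_inputs(feats, rep)  # each row would feed a classifier
```

Every operation in the sketch is a weighted sum or a softmax, so it is differentiable in the filter parameters and the attention logits, which is the property the abstract relies on for end-to-end training.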

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Activity Detection | Charades localize v1 | mAP 25.2 | 52 |
| Activity Detection | MLB-YouTube (test) | mAP 39.6 | 51 |
| Temporal Action Localization | MultiTHUMOS | f-mAP 36.4 | 20 |
| Activity Detection | MultiTHUMOS | mAP 36.4 | 16 |
| Action Detection | MultiTHUMOS | -- | 16 |
| Action Recognition (Dense Labeling) | MultiTHUMOS (test) | mAP 36.4 | 15 |
| Temporal Activity Detection | Charades v1_localize (val) | mAP 19.41 | 15 |
| Multi-label Temporal Action Localization | Charades per-frame 51 | mAP 19.41 | 14 |
| Multi-label Temporal Action Segmentation | Charades 1.0 (test) | Seg-mAP 18.6 | 14 |
| Temporal Activity Detection | MultiTHUMOS 2018 (test) | mAP 46.4 | 12 |
Showing 10 of 20 rows
