
Unsupervised Learning and Segmentation of Complex Activities from Video

About

This paper presents a new method for unsupervised segmentation of complex activities from video into multiple steps, or sub-activities, without any textual input. We propose an iterative discriminative-generative approach that alternates between two steps: discriminatively learning the appearance of sub-activities as a mapping from the videos' visual features to sub-activity labels, and generatively modelling the temporal structure of sub-activities using a Generalized Mallows Model. In addition, we introduce a background model to account for frames unrelated to the actual activities. Our approach is validated on the challenging Breakfast Actions and Inria Instructional Videos datasets and outperforms both unsupervised and weakly-supervised state-of-the-art methods.
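
To make the temporal-structure component more concrete, the sketch below shows how a Generalized Mallows Model can represent an ordering of sub-activities through inversion counts. It uses the standard inversion-count form of the model, in which each count is drawn with probability proportional to exp(-rho_j * v_j); the abstract does not spell out the authors' exact parametrization, so this form is an assumption here. The function names `inversions_to_ordering` and `sample_inversions` and the dispersion values are hypothetical and chosen only for illustration.

```python
import math
import random


def inversions_to_ordering(inv):
    """Turn an inversion-count vector into an ordering of K items.

    Items are numbered 0..K-1 in the canonical order. inv[j] is the number
    of items with a larger index that appear before item j, so inv has
    K-1 entries (the last item can never have a larger item before it).
    """
    k = len(inv) + 1
    ordering = [k - 1]
    # Insert items from largest to smallest. When item j is inserted, the
    # list holds exactly the items larger than j, so position inv[j] puts
    # inv[j] larger items in front of it.
    for j in range(k - 2, -1, -1):
        ordering.insert(inv[j], j)
    return ordering


def sample_inversions(rho, rng):
    """Sample inversion counts, assuming P(v_j) is proportional to exp(-rho[j] * v_j).

    rho[j] > 0 is a dispersion parameter (assumed form): larger values
    keep the sampled ordering closer to the canonical one.
    """
    k = len(rho) + 1
    inv = []
    for j, r in enumerate(rho):
        max_v = k - 1 - j  # item j has at most this many larger items
        weights = [math.exp(-r * v) for v in range(max_v + 1)]
        inv.append(rng.choices(range(max_v + 1), weights=weights)[0])
    return inv


rng = random.Random(0)
sampled = inversions_to_ordering(sample_inversions([1.0, 1.0, 1.0], rng))
print(sampled)  # some permutation of [0, 1, 2, 3]
```

With all inversion counts at zero the ordering is the canonical one, and large dispersion values concentrate nearly all probability mass there, which is how such a model can express a preferred but not rigid order of sub-activities.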

Fadime Sener, Angela Yao • 2018

Related benchmarks

Task                                      | Dataset                          | Result          | Rank
Action Segmentation                       | Breakfast                        | MoF 34.6        | 66
Action Segmentation                       | Breakfast (test)                 | MoF 34.6        | 31
Action Segmentation                       | Breakfast 14                     | MoF 34.6        | 26
Action Segmentation                       | Breakfast Action dataset         | MoF 34.6        | 22
Action Segmentation                       | YouTube Instructions (test)      | F1 Score (%) 27 | 17
Action Segmentation                       | YouTube Instructions             | F1 27           | 16
Temporal Video Segmentation               | Breakfast                        | MoF 0.346       | 14
Temporal action segmentation              | YouTube Instructional YTI (test) | F1 Score 27     | 11
Video segmentation                        | INRIA Instructional Videos       | F1 Score 69.2   | 10
Unsupervised Temporal Action Segmentation | Breakfast                        | MOF 34.6        | 10
Showing 10 of 13 rows
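
The Breakfast rows above report MoF, once as 0.346 and otherwise as 34.6, which presumably are the same quantity on a fraction and a percent scale. As a minimal sketch, assuming MoF (mean over frames) is plain frame-level accuracy, it can be computed as below. The function name `mean_over_frames` and the toy labels are hypothetical. The sketch also assumes predicted segments have already been mapped to ground-truth label names; in unsupervised evaluation that mapping is a separate step that is not shown here.

```python
def mean_over_frames(predicted, ground_truth):
    """Fraction of frames whose predicted label equals the ground-truth label.

    Both arguments are per-frame label sequences of equal, non-zero length.
    Assumes predicted labels are already mapped to ground-truth label names.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("label sequences must have the same length")
    if not predicted:
        raise ValueError("label sequences must not be empty")
    correct = sum(p == g for p, g in zip(predicted, ground_truth))
    return correct / len(predicted)


# Toy example with four frames, three of which are labelled correctly.
pred = ["pour", "pour", "stir", "background"]
truth = ["pour", "stir", "stir", "background"]
mof = mean_over_frames(pred, truth)
print(mof)        # 0.75 as a fraction
print(100 * mof)  # 75.0 as a percentage
```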
