
Segmental Spatiotemporal CNNs for Fine-grained Action Segmentation

About

Joint segmentation and classification of fine-grained actions is important for applications of human-robot interaction, video surveillance, and human skill evaluation. However, despite substantial recent progress in large-scale action classification, the performance of state-of-the-art fine-grained action recognition approaches remains low. We propose a model for action segmentation which combines low-level spatiotemporal features with a high-level segmental classifier. Our spatiotemporal CNN is comprised of a spatial component that uses convolutional filters to capture information about objects and their relationships, and a temporal component that uses large 1D convolutional filters to capture information about how object relationships change across time. These features are used in tandem with a semi-Markov model that models transitions from one action to another. We introduce an efficient constrained segmental inference algorithm for this model that is orders of magnitude faster than the current approach. We highlight the effectiveness of our Segmental Spatiotemporal CNN on cooking and surgical action datasets for which we observe substantially improved performance relative to recent baseline methods.
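To make the idea of constrained segmental inference concrete, here is a minimal sketch. It assumes frame-level class scores (for example, the output of the temporal convolution stage) and pairwise transition scores between actions, and it searches for the best labeling that uses at most a fixed number of segments. The function name, argument names, and the exact recurrence are illustrative assumptions, not the authors' published implementation.

```python
from typing import List, Sequence

NEG_INF = float("-inf")


def constrained_segmental_inference(
    scores: Sequence[Sequence[float]],
    transition: Sequence[Sequence[float]],
    max_segments: int,
) -> List[int]:
    """Best per-frame labeling that uses at most `max_segments` segments.

    scores[t][c]     : score of class c at frame t (hypothetical input)
    transition[a][b] : score for switching from class a to class b
    """
    T, C, K = len(scores), len(scores[0]), max_segments
    # best[k][t][c]: best score for frames 0..t, ending in class c,
    # using exactly k + 1 segments.
    best = [[[NEG_INF] * C for _ in range(T)] for _ in range(K)]
    back = [[[None] * C for _ in range(T)] for _ in range(K)]
    for c in range(C):
        best[0][0][c] = scores[0][c]

    for t in range(1, T):
        for k in range(min(K, t + 1)):
            for c in range(C):
                # Option 1: extend the current segment.
                cand, arg = best[k][t - 1][c], (k, c)
                # Option 2: start a new segment coming from another class.
                if k > 0:
                    for p in range(C):
                        if p == c:
                            continue
                        v = best[k - 1][t - 1][p] + transition[p][c]
                        if v > cand:
                            cand, arg = v, (k - 1, p)
                if cand > NEG_INF:
                    best[k][t][c] = cand + scores[t][c]
                    back[k][t][c] = arg

    # Pick the best end state over all segment counts, then backtrack.
    k, c = max(
        ((k, c) for k in range(K) for c in range(C)),
        key=lambda kc: best[kc[0]][T - 1][kc[1]],
    )
    labels = [0] * T
    for t in range(T - 1, -1, -1):
        labels[t] = c
        if t > 0:
            k, c = back[k][t][c]
    return labels


# Toy example: frame 1 is noisy and favours class 1.
scores = [[2, 0], [0, 1], [2, 0], [0, 2], [0, 2], [0, 2]]
no_transition_cost = [[0, 0], [0, 0]]
print(constrained_segmental_inference(scores, no_transition_cost, max_segments=2))
# [0, 0, 0, 1, 1, 1]
```

Because the recurrence iterates over segment counts rather than over all possible segment durations, its cost in this sketch grows as O(K·T·C²). Limiting the number of segments also suppresses the over-segmentation that a plain frame-wise argmax would produce on the noisy frame.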

Colin Lea, Austin Reiter, Rene Vidal, Gregory D. Hager • 2016

Related benchmarks

Task                              Dataset                       Metric                     Result  Rank
Action Segmentation               50Salads                      Edit Distance              24.8    114
Temporal action segmentation      50Salads                      Accuracy                   59.4    106
Temporal action segmentation      GTEA                          F1 Score @ 10% Threshold   58.7    99
Action Segmentation               GTEA                          F1@10%                     58.7    39
Generic Event Boundary Detection  Kinetics-GEBD (val)           F1 Score @ Threshold 0.05  58.8    37
Temporal action segmentation      50 Salads granularity (Eval)  MoF                        72      24
Action Segmentation               50Salads mid granularity      MoF                        58.1    19
Action Segmentation               JIGSAWS                       Accuracy                   77.7    19
Generic Event Boundary Detection  TAPOS (val)                   F1 Score @ 0.05            23.7    18
Action Segmentation               50 Salads Mid                 Accuracy                   59.4    17
Showing 10 of 20 rows
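
Two of the metrics listed in the benchmark rows, frame-level accuracy (often reported as MoF, mean over frames) and the segmental edit score, can be sketched as follows. This assumes the common convention of normalising the Levenshtein distance between segment label sequences by the longer sequence. The benchmark entries may use a different variant, and all function names here are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence


def frame_accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Percentage of frames whose predicted label matches the ground truth."""
    correct = sum(p == g for p, g in zip(pred, truth))
    return 100.0 * correct / len(truth)


def segment_labels(frames: Sequence[int]) -> List[int]:
    """Collapse runs of identical frame labels into one label per segment."""
    return [label for label, _ in groupby(frames)]


def levenshtein(a: Sequence[int], b: Sequence[int]) -> int:
    """Classic edit distance (insert, delete, substitute) between sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def edit_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Segmental edit score in [0, 100]; assumed normalisation: longer sequence."""
    p, g = segment_labels(pred), segment_labels(truth)
    return 100.0 * (1 - levenshtein(p, g) / max(len(p), len(g)))


truth = [0, 0, 0, 1, 1, 1]
pred = [0, 1, 0, 1, 1, 1]  # one wrong frame, but two spurious segments
print(round(frame_accuracy(pred, truth), 1), edit_score(pred, truth))
# 83.3 50.0
```

The toy example shows why both metrics are reported: a single mislabeled frame barely lowers frame accuracy, yet it halves the edit score because it splits one segment into three.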
