
Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

About

In this work, we address the task of weakly-supervised human action segmentation in long, untrimmed videos. Recent methods have relied on computationally expensive models such as Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs). Their high computational cost prevents them from being deployed at large scale. To overcome these limitations, our design prioritizes efficiency and scalability. We propose a novel action modeling framework with two components. The first is a new temporal convolutional network, the Temporal Convolutional Feature Pyramid Network (TCFPN), which predicts frame-wise action labels. The second is a novel training strategy for weakly-supervised sequence modeling, Iterative Soft Boundary Assignment (ISBA), which aligns action sequences and updates the network in an iterative fashion. The proposed framework is evaluated on two benchmark datasets, Breakfast and Hollywood Extended, with four different evaluation metrics. Extensive experimental results show that our methods achieve performance competitive with or superior to state-of-the-art methods.
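To make the idea of soft frame-wise targets concrete, here is a minimal sketch of one way to turn an ordered list of actions (a transcript) into per-frame training targets with softened boundaries. It is an illustration only, not the ISBA procedure from the paper: the function name `soft_uniform_targets`, the even split of frames across actions, and the `blend` window are all assumptions made for this example.

```python
def soft_uniform_targets(transcript, num_frames, num_classes, blend=2):
    """Spread an ordered transcript evenly over the frames, then soften boundaries.

    Hypothetical helper for illustration; not taken from the paper.
    Returns a list of num_frames probability vectors, each summing to 1.
    """
    n = len(transcript)
    if n == 0 or num_frames < n:
        raise ValueError("need at least one frame per transcript entry")

    # Hard assignment: frame t belongs to segment floor(t * n / num_frames).
    hard = [transcript[t * n // num_frames] for t in range(num_frames)]

    targets = []
    for t in range(num_frames):
        # Average the one-hot labels in a small window around frame t, so the
        # probabilities ramp across each boundary instead of switching abruptly.
        lo, hi = max(0, t - blend), min(num_frames, t + blend + 1)
        window = hard[lo:hi]
        probs = [0.0] * num_classes
        for label in window:
            probs[label] += 1.0 / len(window)
        targets.append(probs)
    return targets


# Two actions (class 0 then class 1) over 8 frames, with a 1-frame blend window.
for row in soft_uniform_targets([0, 1], num_frames=8, num_classes=2, blend=1):
    print([round(p, 2) for p in row])
```

In an iterative scheme, targets like these would be regenerated after each round of training as the estimated boundaries move. How the boundaries are updated is specific to the paper and is not reproduced here.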

Li Ding, Chenliang Xu • 2018

Related benchmarks

Task | Dataset | Metric | Result | Rank
Temporal Action Localization | THUMOS14 (test) | AP @ IoU=0.5 | 22.8 | 319
Temporal Action Localization | ActivityNet 1.2 (val) | mAP@IoU 0.5 | 37 | 110
Action Segmentation | Breakfast | - | - | 107
Temporal action segmentation | Breakfast | Accuracy | 52 | 96
Action Segmentation | Breakfast | MoF | 38.4 | 66
Action Segmentation | Breakfast (test) | MoF | 60.6 | 31
Action Segmentation | COIN | Frame Accuracy | 34.3 | 29
Action Segmentation | Breakfast 14 | MoF | 52 | 26
Action Segmentation | COIN (test) | Frame Accuracy | 34.3 | 23
Action Segmentation | Breakfast Action dataset | MoF | 60.6 | 22
Showing 10 of 31 rows
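
Several rows above report MoF (mean over frames), which is the percentage of frames whose predicted label matches the ground truth. The sketch below computes it for a single video. The function name `mean_over_frames` is an assumption for this example, and benchmark protocols may differ in details such as pooling frames across videos or excluding background frames.

```python
def mean_over_frames(predicted, ground_truth):
    """Percentage of frames whose predicted label equals the ground-truth label."""
    if len(predicted) != len(ground_truth):
        raise ValueError("prediction and ground truth must have the same length")
    if not ground_truth:
        raise ValueError("need at least one frame")
    correct = sum(p == g for p, g in zip(predicted, ground_truth))
    return 100.0 * correct / len(ground_truth)


# Four of five frames are labeled correctly.
print(mean_over_frames([0, 0, 1, 1, 2], [0, 1, 1, 1, 2]))  # 80.0
```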
