
Timestamp-Supervised Action Segmentation from the Perspective of Clustering

About

Video action segmentation under timestamp supervision has recently received much attention due to its lower annotation cost. Most existing methods generate pseudo-labels for all frames in each video to train the segmentation model. However, these methods suffer from incorrect pseudo-labels, especially for the semantically unclear frames in the transition region between two consecutive actions, which we call ambiguous intervals. To address this issue, we propose a novel framework from the perspective of clustering, which consists of two parts. First, pseudo-label ensembling generates incomplete but high-quality pseudo-label sequences, in which the frames in ambiguous intervals have no pseudo-labels. Second, iterative clustering propagates the pseudo-labels to the ambiguous intervals step by step through clustering, and thus updates the pseudo-label sequences used to train the model. We further introduce a clustering loss, which encourages the features of frames within the same action segment to be more compact. Extensive experiments show the effectiveness of our method.
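
To make the label-propagation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes per-frame features are already available as a NumPy array, marks frames in ambiguous intervals with -1 (our own convention), and fills each ambiguous interval by picking the single boundary that minimizes the squared distance of its frames to the mean features of the labeled runs on either side. The function name `propagate_labels` and the cost function are hypothetical choices for illustration; the paper's actual clustering procedure and loss may differ.

```python
import numpy as np

UNLABELED = -1  # assumed marker for frames inside an ambiguous interval


def propagate_labels(features, labels):
    """Fill each unlabeled run that lies between two labeled runs.

    features: (T, D) float array of per-frame features.
    labels:   (T,) int array, UNLABELED where no pseudo-label exists.
    Returns a new label array; the input array is left unchanged.
    """
    out = labels.copy()
    T = len(labels)
    t = 0
    while t < T:
        if labels[t] != UNLABELED:
            t += 1
            continue
        start = t
        while t < T and labels[t] == UNLABELED:
            t += 1
        end = t  # the unlabeled run is [start, end)
        if start == 0 or end == T:
            continue  # no labeled neighbour on one side; leave as is

        # Extend to the contiguous labeled runs on the left and right.
        l0 = start - 1
        while l0 > 0 and labels[l0 - 1] == labels[start - 1]:
            l0 -= 1
        r1 = end
        while r1 + 1 < T and labels[r1 + 1] == labels[end]:
            r1 += 1

        left_center = features[l0:start].mean(axis=0)
        right_center = features[end:r1 + 1].mean(axis=0)
        d_left = ((features[start:end] - left_center) ** 2).sum(axis=1)
        d_right = ((features[start:end] - right_center) ** 2).sum(axis=1)

        # Boundary k: the first k frames join the left action, the rest the right.
        costs = [d_left[:k].sum() + d_right[k:].sum()
                 for k in range(end - start + 1)]
        k = int(np.argmin(costs))
        out[start:start + k] = labels[start - 1]
        out[start + k:end] = labels[end]
    return out
```

Restricting the search to a single boundary per interval keeps the propagated labels temporally contiguous, which matches the assumption that an ambiguous interval sits between exactly two consecutive actions.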

Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Fuchun Sun • 2022

Related benchmarks

Task                                          Dataset             Result          Rank
Skeleton-based Temporal Action Segmentation   PKU-MMD (X-sub)     Accuracy 58.3   35
Temporal action segmentation                  MCFS-130            Accuracy 57.6   29
Skeleton-based Temporal Action Segmentation   PKU-MMD (X-view)    Accuracy 62.6   21
Temporal action segmentation                  MCFS-22             Accuracy 67.1   17