Combining Boundary Supervision and Segment-Level Regularization for Fine-Grained Action Segmentation

About

Recent progress in Temporal Action Segmentation (TAS) has increasingly relied on complex architectures, which can hinder practical deployment. We present a lightweight dual-loss training framework that improves fine-grained segmentation quality with only one additional output channel and two auxiliary loss terms, requiring minimal architectural modification. Our approach combines two components: a boundary-regression loss, which promotes accurate temporal localization via a single-channel boundary prediction, and a CDF-based segment-level regularization loss, which encourages coherent within-segment structure by matching cumulative distributions over predicted and ground-truth segments. The framework is architecture-agnostic and can be integrated into existing TAS models (e.g., MS-TCN, C2F-TCN, FACT) as a training-time loss function. Across three benchmark datasets and three different models, the proposed method improves segment-level consistency and boundary quality, yielding higher F1 and Edit scores. Frame-wise accuracy remains largely unchanged, highlighting that precise segmentation can be achieved through simple loss design rather than heavier architectures or inference-time refinements.
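
To make the two auxiliary terms concrete, here is a minimal sketch in plain Python. It is not the authors' implementation, since the abstract does not give the exact formulas. Several choices are assumptions made only for illustration: Gaussian-smoothed boundary targets with width `sigma`, a mean-squared-error boundary loss, and a CDF loss computed per class as the L1 gap between normalized cumulative sums over time. All function names are hypothetical.

```python
import math

def boundary_targets(labels, sigma=1.0):
    """Soft boundary targets (assumed form): a Gaussian bump centred on
    every frame where the ground-truth label changes."""
    change_points = [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]
    targets = []
    for t in range(len(labels)):
        value = 0.0
        for c in change_points:
            value = max(value, math.exp(-((t - c) ** 2) / (2 * sigma ** 2)))
        targets.append(value)
    return targets

def boundary_regression_loss(pred_boundary, labels, sigma=1.0):
    """Mean squared error between the single-channel boundary prediction
    and the soft targets (MSE is an assumption, not taken from the paper)."""
    targets = boundary_targets(labels, sigma)
    return sum((p - y) ** 2 for p, y in zip(pred_boundary, targets)) / len(targets)

def cumulative(values):
    """Normalized cumulative sum; all zeros if the input has no mass."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running / total)
    return out

def cdf_loss(probs, labels, num_classes):
    """Assumed CDF-matching term: for each class, compare the cumulative
    distribution of predicted probability over time with that of the
    one-hot ground truth, using a mean absolute difference."""
    num_frames = len(labels)
    loss = 0.0
    for k in range(num_classes):
        pred_cdf = cumulative([probs[t][k] for t in range(num_frames)])
        true_cdf = cumulative([1.0 if labels[t] == k else 0.0 for t in range(num_frames)])
        loss += sum(abs(p - q) for p, q in zip(pred_cdf, true_cdf)) / num_frames
    return loss / num_classes

# Toy sequence: six frames, two classes, one boundary at frame 3.
labels = [0, 0, 0, 1, 1, 1]
clean = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
# Over-segmented prediction: class 1 flickers on at frame 1.
noisy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

print(cdf_loss(clean, labels, 2))
print(cdf_loss(noisy, labels, 2))
print(boundary_regression_loss([0.0] * 6, labels))
```

In a training loop, terms like these would typically be added to the base model's existing classification loss with weighting coefficients; the weights and the exact combination used in the paper are not stated in the abstract.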

Hinako Mitsuoka, Kazuhiro Hotta • 2026

Related benchmarks

Task                          Dataset    Metric                    Result  Rank
Temporal action segmentation  50Salads   Accuracy                  86.44   112
Temporal action segmentation  GTEA       F1 Score @ 10% Threshold  93.34   105
Temporal action segmentation  Breakfast  Accuracy                  76.08   102