Condensing Action Segmentation Datasets via Generative Network Inversion

About

This work presents the first condensation approach for procedural video datasets used in temporal action segmentation (TAS). We propose a condensation framework that leverages a generative prior learned from the dataset, together with network inversion, to condense data into compact latent codes, significantly reducing storage along both the temporal and channel dimensions. Orthogonally, we propose sampling diverse and representative action sequences to minimize video-wise redundancy. Our evaluation on standard benchmarks demonstrates consistent effectiveness in condensing TAS datasets while achieving competitive performance. Specifically, on the Breakfast dataset, our approach reduces storage by over 500× while retaining 83% of the performance obtained by training on the full dataset. Furthermore, when applied to a downstream incremental learning task, it yields superior performance compared to the state of the art.

Guodong Ding, Rongyu Chen, Angela Yao • 2025
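
The network inversion idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frozen linear decoder `W`, the temporal stride, the mean squared error objective, and the names `decode` and `invert` are all assumptions made for illustration, standing in for the learned generative prior. The sketch optimizes a latent code that is shorter in time and narrower in channels than the frame-wise features it reconstructs, and the resulting toy compression ratio has no relation to the figures reported in the paper.

```python
import numpy as np

def decode(z, W, stride):
    """Toy frozen decoder (assumption): repeat each latent step `stride`
    times along time, then map latent channels to feature channels."""
    return np.repeat(z, stride, axis=0) @ W

def invert(x, W, stride, steps=500):
    """Recover a latent code z such that decode(z) approximates x,
    using plain gradient descent on the mean squared error."""
    T, D = x.shape              # T must be divisible by stride
    d = W.shape[0]
    z = np.zeros((T // stride, d))
    # Step size 1/L, where L bounds the curvature of the MSE objective.
    lr = x.size / (2.0 * stride * np.linalg.norm(W, 2) ** 2)
    losses = []
    for _ in range(steps):
        residual = decode(z, W, stride) - x
        losses.append(float(np.mean(residual ** 2)))
        # Gradient w.r.t. the upsampled latent, shape (T, d).
        grad_up = (2.0 / x.size) * residual @ W.T
        # Sum gradients of the `stride` frames that share one latent step.
        grad_z = grad_up.reshape(T // stride, stride, d).sum(axis=1)
        z -= lr * grad_z
    return z, losses

# Hypothetical sizes: 64 frames of 32-dim features, latent of 16 steps x 4 channels.
rng = np.random.default_rng(0)
T, D, d, stride = 64, 32, 4, 4
W = rng.normal(size=(d, D))                                # frozen decoder weights
x = decode(rng.normal(size=(T // stride, d)), W, stride)   # synthetic features
z, losses = invert(x, W, stride)
ratio = x.size / z.size                                    # stored values: features vs. latent
print(z.shape, ratio, losses[0], losses[-1])
```

Only `z` and the shared decoder need to be stored; the features are regenerated with `decode(z, W, stride)` when needed. In this toy setup the saving comes from both the temporal stride and the narrower channel dimension.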

Related benchmarks

Task                          Dataset                    Metric                     Result  Rank
Temporal action segmentation  50Salads                   Accuracy                   81.2    106
Temporal action segmentation  GTEA                       F1 Score @ 10% Threshold   86.4    99
Temporal action segmentation  Breakfast                  Accuracy                   61.1    96
Action Segmentation           Breakfast 10 tasks (test)  Acc                        46.7    16