Counting Out Time: Class Agnostic Video Repetition Counting in the Wild

About

We present an approach for estimating the period with which an action is repeated in a video. The crux of the approach lies in constraining the period prediction module to use temporal self-similarity as an intermediate representation bottleneck that allows generalization to unseen repetitions in videos in the wild. We train this model, called RepNet, with a synthetic dataset that is generated from a large unlabeled video collection by sampling short clips of varying lengths and repeating them with different periods and counts. This combination of synthetic data and a powerful yet constrained model allows us to predict periods in a class-agnostic fashion. Our model substantially exceeds the state-of-the-art performance on existing periodicity (PERTUBE) and repetition counting (QUVA) benchmarks. We also collect a new challenging dataset called Countix (~90 times larger than existing datasets) which captures the challenges of repetition counting in real-world videos. Project webpage: https://sites.google.com/view/repnet

Debidatta Dwibedi, Yusuf Aytar, Jonathan Tompson, Pierre Sermanet, Andrew Zisserman • 2020
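
The two ideas in the abstract, building periodic training data by repeating a short clip and reading the period off a temporal self-similarity matrix, can be illustrated with a toy sketch. This is not the RepNet implementation. The function names are hypothetical, frames are stand-in feature vectors (plain lists of floats) rather than learned embeddings, the similarity is assumed to be a row-wise softmax over negative squared Euclidean distances, and the period is picked by a simple diagonal-averaging heuristic in place of the learned period predictor.

```python
import math


def repeat_clip(clip, count):
    """Tile a short clip `count` times to form a synthetic periodic sequence."""
    return [frame for _ in range(count) for frame in clip]


def self_similarity(frames):
    """Return a matrix whose row i is a softmax over the negative squared
    Euclidean distances from frame i to every frame (an assumed similarity)."""
    matrix = []
    for a in frames:
        scores = [-sum((x - y) ** 2 for x, y in zip(a, b)) for b in frames]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        matrix.append([e / total for e in exps])
    return matrix


def estimate_period(matrix, tol=1e-9):
    """Heuristic stand-in for a learned predictor: choose the smallest lag
    whose off-diagonal has the highest mean similarity."""
    n = len(matrix)
    best_lag, best_score = None, -1.0
    for lag in range(1, n // 2 + 1):
        score = sum(matrix[i][i + lag] for i in range(n - lag)) / (n - lag)
        if score > best_score + tol:  # tolerance keeps the smallest lag on ties
            best_lag, best_score = lag, score
    return best_lag


# Toy clip of three 1-D "frame features", repeated four times.
clip = [[0.0], [1.0], [2.0]]
video = repeat_clip(clip, 4)
period = estimate_period(self_similarity(video))
count = len(video) // period
```

In this toy setting the repetition count falls out as the sequence length divided by the estimated period. Real videos have noisy features and varying periods, which is the case the learned model in the paper is meant to handle.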

Related benchmarks

Task	Dataset	Result
Video Repetition Counting	UCFRep (test)	MAE 99.8	32
Repetitive Action Counting	RepCount (test)	MAE 0.995	9
Repetitive Action Counting	RepCount-A Regular Setting (test)	MAE 0.995	9
Repetitive Action Counting	UCFRep-pose (test)	MAE 98.1	8
Step Counting	PD-FoG Moderate H&Y	MAE 61.8	8
Repetitive Action Counting	Countix (test)	MAE 0.36	8
Step Counting	PD-FoG H&Y (Mild)	MAE 62.2	8
Step Counting	PD-FoG H&Y (All)	MAE 62.2	8
Repetitive Action Counting	RepCount-pose (test)	MAE 0.995	8
Visual Repetition Counting	RepCount benchmark split	MAE 0.013	7

Showing 10 of 23 rows
