
Every Shot Counts: Using Exemplars for Repetition Counting in Videos

About

Video repetition counting infers the number of repetitions of recurring actions or motion within a video. We propose an exemplar-based approach that discovers visual correspondence of video exemplars across repetitions within target videos. Our proposed Every Shot Counts (ESCounts) model is an attention-based encoder-decoder that encodes videos of varying lengths alongside exemplars from the same and different videos. In training, ESCounts regresses locations of high correspondence to the exemplars within the video. In tandem, our method learns a latent that encodes representations of general repetitive motions, which we use for exemplar-free, zero-shot inference. Extensive experiments on three commonly used datasets (RepCount, Countix, and UCFRep) show that ESCounts obtains state-of-the-art performance on all of them. Detailed ablations further demonstrate the effectiveness of our method.

Saptarshi Sinha, Alexandros Stergiou, Dima Damen • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Repetition Counting | UCFRep (test) | MAE 21.6 | 32 |
| Repetitive Action Counting | RepCount (test) | MAE 0.213 | 9 |
| Repetitive Action Counting | Countix (test) | MAE 0.276 | 8 |
| Visual Repetition Counting | RepCount benchmark split | MAE 0.213 | 7 |
| Video Repetition Counting | Countix (test) | MAE 0.374 | 5 |
| Repetition Counting | Mo-RepCount | OBO 0.397 | 5 |
| Video Repetition Counting | Countix | MAE 0.276 | 4 |
| Egocentric Repetitive Action Counting | OVR-Ego4D (test) | RMSE 2.41 | 3 |
| Visual Repetition Counting | RepCount open set | MAE 0.436 | 2 |
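The benchmark results above are reported as MAE, OBO, and RMSE, which this page does not define. The sketch below uses the definitions commonly seen in repetition-counting work: MAE as the mean absolute count error normalized by the ground-truth count, OBO as the fraction of videos whose predicted count is within one of the ground truth, and RMSE as the root mean squared count error. Whether each listed benchmark uses exactly these conventions is an assumption, since individual leaderboards may scale or normalize differently. The function names and example counts are illustrative only.

```python
import math


def mae(preds, targets):
    """Normalized mean absolute error: mean of |pred - gt| / gt.

    Assumes every ground-truth count is greater than zero.
    """
    return sum(abs(p - t) / t for p, t in zip(preds, targets)) / len(targets)


def obo(preds, targets):
    """Off-by-one accuracy: share of videos with |pred - gt| <= 1."""
    return sum(abs(p - t) <= 1 for p, t in zip(preds, targets)) / len(targets)


def rmse(preds, targets):
    """Root mean squared error of the raw counts."""
    squared = sum((p - t) ** 2 for p, t in zip(preds, targets))
    return math.sqrt(squared / len(targets))


if __name__ == "__main__":
    # Hypothetical predicted and ground-truth counts for three videos.
    predicted = [4, 10, 7]
    ground_truth = [5, 10, 4]
    print("MAE :", mae(predicted, ground_truth))
    print("OBO :", obo(predicted, ground_truth))
    print("RMSE:", rmse(predicted, ground_truth))
```

Under these definitions, lower is better for MAE and RMSE, and higher is better for OBO.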
