Actions as Moving Points

About

Existing action tubelet detectors often depend on heuristic anchor design and placement, which can be computationally expensive and sub-optimal for precise localization. In this paper, we present a conceptually simple, computationally efficient, and more precise action tubelet detection framework, termed MovingCenter Detector (MOC-detector), which treats an action instance as a trajectory of moving points. Based on the insight that movement information can simplify and assist action tubelet detection, our MOC-detector is composed of three crucial head branches: (1) a Center Branch for instance center detection and action recognition, (2) a Movement Branch for movement estimation at adjacent frames to form trajectories of moving points, and (3) a Box Branch for spatial extent detection by directly regressing bounding box size at each estimated center. These three branches work together to generate the tubelet detection results, which can be further linked with a matching strategy to yield video-level tubes. Our MOC-detector outperforms the existing state-of-the-art methods on both frame-mAP and video-mAP on the JHMDB and UCF101-24 datasets. The performance gap is more evident at higher video IoU thresholds, demonstrating that our MOC-detector is particularly effective for more precise action detection. We provide the code at https://github.com/MCG-NJU/MOC-Detector.
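To make the three-branch idea concrete, here is a minimal sketch of how a single tubelet could be decoded from the branch outputs. It is not taken from the authors' repository: the function name `decode_tubelet`, the array layouts, and the choice to read the box size at the rounded, clipped moved center are all assumptions made for illustration, and it decodes only the single highest-scoring center.

```python
import numpy as np


def decode_tubelet(center_heatmap, movement, box_size):
    """Decode one tubelet from hypothetical branch outputs.

    Assumed layouts (illustrative, not the authors' actual tensors):
      center_heatmap: (C, H, W) class scores for the key frame
      movement:       (K, 2, H, W) per-frame (dx, dy) offsets from the key-frame center
      box_size:       (K, 2, H, W) per-frame (width, height)
    Returns (class_id, score, boxes) with boxes of shape (K, 4) as x1, y1, x2, y2.
    """
    _, height, width = center_heatmap.shape

    # Center Branch: the strongest peak gives the action class and key-frame center.
    cls, y, x = np.unravel_index(int(np.argmax(center_heatmap)), center_heatmap.shape)
    score = float(center_heatmap[cls, y, x])

    boxes = []
    for k in range(movement.shape[0]):
        # Movement Branch: shift the key-frame center to its position in frame k.
        dx, dy = movement[k, :, y, x]
        cx, cy = x + dx, y + dy

        # Box Branch: read the box size at the moved center (rounded and clipped).
        ix = int(np.clip(round(float(cx)), 0, width - 1))
        iy = int(np.clip(round(float(cy)), 0, height - 1))
        w, h = box_size[k, :, iy, ix]

        boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

    return int(cls), score, np.array(boxes, dtype=float)
```

A full detector would keep the top-N peaks rather than one and then link the resulting tubelets across the video, which this sketch leaves out.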

Yixuan Li, Zixu Wang, Limin Wang, Gangshan Wu • 2020

Related benchmarks

Task                                 Dataset               Metric                Result  Rank
Action Detection                     JHMDB-21              video-mAP@0.5         77.2    21
Spatio-temporal Action Localization  UCF101 24             Video-mAP (IoU=0.2)   82.8    20
Spatio-temporal Action Localization  J-HMDB-21             Video mAP (IoU=0.2)   77.3    15
Action Detection                     UCF101 24             video-mAP@0.5         54.4    13
Video Action Detection               UCF101 24             F-mAP@0.5             78      13
Action Detection                     JHMDB (trimmed)       Video-mAP@0.5         77.2    12
Action Detection                     UCF101 24 untrimmed   Video-mAP@0.5         54.4    10
Action Detection                     UCF-101-24 (split 1)  Frame mAP (IoU=0.5)   78      10
Action Detection                     J-HMDB                V-Score (IoU 0.5)     77.2    10
Spatio-temporal action detection     UCF-24 (test)         F-mAP (IoU=0.5)       78      8
Showing 10 of 22 rows
