
Jointly Attentive Spatial-Temporal Pooling Networks for Video-based Person Re-Identification

About

Person re-identification (person re-id) is a crucial task because of its applications in visual surveillance and human-computer interaction. In this work, we present a novel joint Spatial and Temporal Attention Pooling Network (ASTPN) for video-based person re-identification, which enables the feature extractor to be aware of the current input video sequences, so that the interdependency between the matching items can directly influence the computation of each other's representation. Specifically, the spatial pooling layer selects regions from each frame, while the attentive temporal pooling selects informative frames over the sequence; both pooling steps are guided by information from distance matching. Experiments are conducted on the iLIDS-VID, PRID-2011 and MARS datasets, and the results demonstrate that this approach outperforms existing state-of-the-art methods. We also analyze how joint pooling in both dimensions boosts person re-id performance more effectively than using either of them separately.

Shuangjie Xu, Yu Cheng, Kang Gu, Yang Yang, Shiyu Chang, Pan Zhou • 2017
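
To make the idea of temporal pooling "guided by the information from distance matching" more concrete, here is a minimal NumPy sketch. The abstract does not give the exact formulation, so everything specific here is an assumption for illustration: a bilinear similarity between probe and gallery frame features through a hypothetical parameter matrix `U`, a `tanh` nonlinearity, max-pooling over rows and columns, and a softmax to obtain per-frame weights. The function name `attentive_temporal_pooling` is likewise hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attentive_temporal_pooling(probe, gallery, U):
    """Pool two frame-feature sequences with weights that depend on each other.

    probe:   (T_p, d) array, one feature vector per probe frame
    gallery: (T_g, d) array, one feature vector per gallery frame
    U:       (d, d) array, assumed bilinear interaction matrix (learned in practice)
    """
    # Frame-to-frame interaction scores between the two sequences.
    A = np.tanh(probe @ U @ gallery.T)          # shape (T_p, T_g)

    # A frame scores high if it matches strongly with some frame
    # in the other sequence.
    w_probe = softmax(A.max(axis=1))            # shape (T_p,)
    w_gallery = softmax(A.max(axis=0))          # shape (T_g,)

    # Weighted average of frame features gives one vector per sequence.
    v_probe = w_probe @ probe                   # shape (d,)
    v_gallery = w_gallery @ gallery             # shape (d,)
    return v_probe, v_gallery, w_probe, w_gallery


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probe = rng.normal(size=(5, 8))
    gallery = rng.normal(size=(7, 8))
    U = rng.normal(size=(8, 8))
    v_p, v_g, w_p, w_g = attentive_temporal_pooling(probe, gallery, U)
    print("probe frame weights:", np.round(w_p, 3))
    print("distance:", np.linalg.norm(v_p - v_g))
```

Because the weights for one sequence are computed from its interaction with the other, the same probe video can receive a different pooled representation for each gallery video it is compared against, which is the interdependency the abstract describes.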

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Video Person Re-ID | MARS | Rank-1 Acc: 44 | 106 |
| Person Re-Identification | iLIDS-VID | CMC-1: 62 | 80 |
| Video Person Re-ID | iLIDS-VID | Rank-1: 62 | 80 |
| Person Re-Identification | MARS (test) | Rank-1: 44 | 72 |
| Person Re-Identification | MARS | Rank-1: 44 | 67 |
| Person Re-Identification | PRID2011 | Rank-1: 77 | 66 |
| Person Re-Identification | PRID 2011 (test) | Rank-1: 77 | 48 |
| Video Person Re-Identification | MARS (test) | Rank-1: 44 | 35 |
| Video Person Re-Identification | iLIDS-VID (test) | Rank-1: 62 | 25 |
| Video Person Re-Identification | PRID 2011 | Rank-1 Accuracy: 77 | 23 |
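
The Rank-1 and CMC-1 results above count how often the closest gallery match to a probe has the correct identity. A minimal sketch of that computation follows; the function name `cmc_rank_k` and the toy distance matrix are hypothetical, and the sketch assumes a simplified protocol (every probe identity appears in the gallery, no camera-based filtering), which may differ from the evaluation protocol of each benchmark.

```python
import numpy as np


def cmc_rank_k(dist, probe_ids, gallery_ids, k=1):
    """Fraction of probes whose true identity is among the k nearest gallery items.

    dist:        (num_probe, num_gallery) array of distances (smaller = more similar)
    probe_ids:   (num_probe,) identity label for each probe
    gallery_ids: (num_gallery,) identity label for each gallery item
    """
    dist = np.asarray(dist, dtype=float)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)

    # Indices of the k closest gallery items for every probe.
    top_k = np.argsort(dist, axis=1)[:, :k]

    # A probe is a hit if any of its k closest gallery items shares its identity.
    hits = (gallery_ids[top_k] == probe_ids[:, None]).any(axis=1)
    return float(hits.mean())


if __name__ == "__main__":
    # Toy example: 3 probes, 3 gallery items, identities 0..2.
    dist = [
        [0.1, 0.9, 0.5],  # probe 0: closest is gallery 0 (correct)
        [0.8, 0.7, 0.2],  # probe 1: closest is gallery 2 (wrong)
        [0.3, 0.6, 0.4],  # probe 2: closest is gallery 0 (wrong)
    ]
    ids = [0, 1, 2]
    print("Rank-1:", cmc_rank_k(dist, ids, ids, k=1))
    print("Rank-2:", cmc_rank_k(dist, ids, ids, k=2))
```

In the toy example only one of the three probes has the correct identity as its nearest match, so the rank-1 score is 1/3; by rank 2 every probe is matched.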
