Deep Association Learning for Unsupervised Video Person Re-identification
About
Deep learning methods have started to dominate research on video-based person re-identification (re-id). However, existing methods are mostly supervised and require exhaustive manual labelling of cross-view pairwise data, so they lack scalability and practicality in real-world video surveillance applications. In this work, we address the video person re-id task with a novel Deep Association Learning (DAL) scheme, the first end-to-end deep learning method that uses no identity labels in model initialisation or training. DAL learns a deep re-id matching model by jointly optimising two margin-based association losses in an end-to-end manner; these losses constrain the association of each frame to its best-matched intra-camera representation and cross-camera representation. Existing standard CNNs can be readily employed within the DAL scheme. Experimental results demonstrate that DAL significantly outperforms current state-of-the-art unsupervised video person re-id methods on three benchmarks: PRID 2011, iLIDS-VID and MARS.
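To make the idea of a margin-based association loss more concrete, the following is a minimal sketch in plain Python. It is not the authors' formulation: the abstract does not give the loss equations, so the hinge form, the Euclidean distance, the default `margin`, the `weight` term and all function names here are illustrative assumptions. The sketch only shows the general pattern of associating a frame feature with its best-matched (nearest) representation and penalising other representations that are not at least a margin further away, once for an intra-camera set and once for a cross-camera set.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def margin_association_loss(frame, anchors, margin=0.2):
    """Hinge-style loss tying `frame` to its nearest anchor (assumed form).

    `anchors` is a list of representation vectors (hypothetical name).
    Returns (loss, index_of_best_matched_anchor).
    """
    dists = [l2_distance(frame, a) for a in anchors]
    best = min(range(len(anchors)), key=dists.__getitem__)
    others = [d for i, d in enumerate(dists) if i != best]
    if not others:
        return 0.0, best
    # Penalise any other anchor that is not at least `margin` further away
    # than the best-matched one; average over the other anchors.
    loss = sum(max(0.0, margin + dists[best] - d) for d in others) / len(others)
    return loss, best


def joint_association_loss(frame, intra_anchors, cross_anchors,
                           margin=0.2, weight=1.0):
    """Sum of an intra-camera and a cross-camera term (assumed weighting)."""
    intra, _ = margin_association_loss(frame, intra_anchors, margin)
    cross, _ = margin_association_loss(frame, cross_anchors, margin)
    return intra + weight * cross
```

In an actual training setup the frame features would come from a CNN and the loss would be computed with a differentiable framework; how DAL builds and updates its intra-camera and cross-camera representations is not specified in the abstract and is not modelled here.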
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Person Re-Identification | iLIDS-VID | CMC-1 | 56.9 | 80 |
| Video Person Re-ID | iLIDS-VID | Rank-1 | 56.9 | 80 |
| Person Re-Identification | MARS (test) | Rank-1 | 49.3 | 72 |
| Person Re-Identification | MARS | Rank-1 | 49.3 | 67 |
| Person Re-Identification | PRID2011 | Rank-1 | 85.3 | 66 |
| Video Person Re-Identification | MARS (test) | Rank-1 | 46.8 | 35 |
| Video Person Re-Identification | DukeMTMC-VideoReID | Rank-1 Accuracy | 79.3 | 26 |
| Video Person Re-Identification | PRID 2011 | Rank-1 Accuracy | 85.3 | 23 |