Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions
About
Humans can retrieve any person from visual and language descriptions alike. However, the computer vision community currently studies person re-identification (ReID) tasks for different scenarios separately, which limits real-world applications. This paper addresses the problem by proposing a new instruct-ReID task, which requires the model to retrieve images according to given image or language instructions. Instruct-ReID is a more general ReID setting in which six existing ReID tasks can be viewed as special cases obtained by designing different instructions. We propose a large-scale OmniReID benchmark and an adaptive triplet loss as a baseline method to facilitate research in this new setting. Experimental results show that the proposed multi-purpose ReID model, trained on our OmniReID benchmark without fine-tuning, improves mAP by +0.5%, +0.6%, and +7.7% on Market1501, MSMT17, and CUHK03 for traditional ReID; by +6.4%, +7.1%, and +11.2% on PRCC, VC-Clothes, and LTCC for clothes-changing ReID; by +11.7% on COCAS+ real2 for clothes-template-based clothes-changing ReID when using only RGB images; and by +24.9% on COCAS+ real2 for our newly defined language-instructed ReID. It also improves results by +4.3% on LLCM for visible-infrared ReID and by +2.6% on CUHK-PEDES for text-to-image ReID. The datasets, model, and code will be available at https://github.com/hwz-zju/Instruct-ReID.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Person Re-Identification | Market1501 (test) | Rank-1 Accuracy | 96.5 | 1264 |
| Person Re-Identification | MSMT17 (test) | Rank-1 Acc | 86.9 | 499 |
| Text-to-image Person Re-identification | CUHK-PEDES (test) | Rank-1 Accuracy (R-1) | 74.2 | 150 |
| Person Re-Identification | CUHK03 (test) | Rank-1 Accuracy | 86.5 | 108 |
| Person Re-Identification | LTCC General | -- | -- | 82 |
| Person Re-Identification | LTCC cloth-changing | Rank-1 | 75.8 | 60 |
| Person Re-Identification | PRCC (CC) | Top-1 Acc | 54.2 | 50 |
| Person Re-Identification | LTCC CC protocol (test) | R-1 Accuracy | 46.7 | 27 |
| Person Re-Identification | PRCC | Rank1 Acc | 54.2 | 15 |
| Cross-modality Person Re-identification | LLCM Visible to Infrared (test) | Rank-1 Acc | 66.7 | 11 |
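
The abstract reports mAP and the table reports Rank-1 accuracy. As a rough illustration of how these two retrieval metrics are commonly defined, the sketch below computes both from ranked gallery lists. It is not the paper's evaluation code: the function names and the toy data are hypothetical, and it assumes each query comes with gallery identity labels already sorted by descending similarity. Dataset-specific rules, such as excluding gallery images taken by the same camera as the query, are omitted.

```python
def rank1_and_ap(query_id, ranked_ids):
    """Return (rank-1 hit, average precision) for a single query.

    ranked_ids holds gallery identity labels, most similar first.
    """
    rank1 = 1.0 if ranked_ids and ranked_ids[0] == query_id else 0.0
    hits = 0
    precision_sum = 0.0
    for position, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            # Precision measured at the position of each correct match.
            precision_sum += hits / position
    ap = precision_sum / hits if hits else 0.0
    return rank1, ap


def evaluate(rankings):
    """Average over queries; returns (Rank-1 %, mAP %)."""
    scores = [rank1_and_ap(query_id, ranked) for query_id, ranked in rankings]
    n = len(scores)
    rank1 = 100.0 * sum(s[0] for s in scores) / n
    mean_ap = 100.0 * sum(s[1] for s in scores) / n
    return rank1, mean_ap


# Toy data (hypothetical): two queries with three gallery images each.
example = [
    ("a", ["a", "b", "a"]),  # correct at ranks 1 and 3, AP = (1/1 + 2/3) / 2
    ("b", ["a", "b", "c"]),  # correct at rank 2 only, AP = 1/2
]
rank1, mean_ap = evaluate(example)
print(f"Rank-1: {rank1:.1f}, mAP: {mean_ap:.1f}")  # Rank-1: 50.0, mAP: 66.7
```

Rank-1 only checks whether the top result is correct, while mAP rewards placing every correct match near the top, so the two can differ noticeably for identities with many gallery images.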