ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language

About

Person search by natural language aims at retrieving a specific person from a large-scale image pool that matches a given textual description. While most current methods treat the task as holistic visual-textual feature matching, we approach it from an attribute-aligning perspective that allows grounding specific attribute phrases to the corresponding visual regions. This yields more robust feature learning and improved performance, because the referred identity can be accurately pinned down by multiple attribute-level visual cues. Concretely, our Visual-Textual Attribute Alignment model (dubbed ViTAA) learns to disentangle the feature space of a person into subspaces corresponding to attributes, using a lightweight auxiliary attribute segmentation branch. It then aligns these visual features with the textual attributes parsed from the sentences by using a novel contrastive learning loss. We validate our ViTAA framework through extensive experiments on person search by natural language and by attribute-phrase queries, on which our system achieves state-of-the-art performance. Code will be publicly available upon publication.
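To make the idea of contrastively aligning a visual attribute feature with a textual attribute feature more concrete, here is a minimal sketch. It is a generic pairwise contrastive objective on cosine similarity, not the loss defined in the paper; the function names, the margins `pos_margin` and `neg_margin`, and the `scale` factor are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def alignment_loss(visual, textual, same_identity,
                   pos_margin=0.6, neg_margin=0.4, scale=10.0):
    """Generic contrastive alignment loss for one visual/textual pair.

    Hypothetical stand-in, NOT the paper's loss: matched pairs are
    pushed above pos_margin in cosine similarity, mismatched pairs
    are pushed below neg_margin.
    """
    s = cosine(visual, textual)
    if same_identity:
        return softplus(-scale * (s - pos_margin))
    return softplus(scale * (s - neg_margin))

# Toy 2-D features for a single attribute subspace (illustrative values only).
upper_body_visual = [1.0, 0.0]
matching_phrase = [1.0, 0.0]
unrelated_phrase = [0.0, 1.0]

loss_matched = alignment_loss(upper_body_visual, matching_phrase, True)
loss_mismatched = alignment_loss(upper_body_visual, unrelated_phrase, True)
```

In this sketch a matched pair that is already well aligned incurs a small loss, while a matched pair with low similarity incurs a large one; the per-attribute losses would then be summed over attribute subspaces in whatever way the actual model prescribes.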

Zhe Wang, Zhiyuan Fang, Jun Wang, Yezhou Yang • 2020

Related benchmarks

Task	Dataset	Metric	Result
Text-based Person Search	CUHK-PEDES (test)	Rank-1	55.97	171
Text-to-image Person Re-identification	CUHK-PEDES (test)	Rank-1 Accuracy (R-1)	55.97	150
Text-to-Image Retrieval	CUHK-PEDES (test)	Recall@1	55.97	114
Text-based Person Search	ICFG-PEDES (test)	R@1	50.98	109
Text-based Person Search	CUHK-PEDES	Recall@1	56	90
Text-to-image Person Re-identification	ICFG-PEDES (test)	Rank-1	0.5098	81
Text-based Person Retrieval	ICFG-PEDES	R@1	50.98	76
Text-to-image Person Re-identification	CUHK-PEDES	Rank-1	55.97	51
Person Search	CUHK-PEDES (test)	Recall@1	55.97	47
Text-based Person Search	ICFG-PEDES	R@1	50.98	47

Showing 10 of 17 rows
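
The metrics listed above (Rank-1, R-1, Recall@1, R@1) are commonly used as names for the same quantity: the percentage of text queries whose top-ranked gallery image shows the correct identity. The following is a minimal sketch of that computation, assuming a precomputed query-by-gallery similarity matrix; the function name `recall_at_k` and the toy data are hypothetical and are not taken from the paper or from any benchmark toolkit.

```python
def recall_at_k(similarity, query_ids, gallery_ids, k=1):
    """Percentage of queries with a correct identity among the top-k images.

    similarity[q][g] is the score between text query q and gallery image g
    (higher means more similar). Identity labels are plain integers.
    """
    hits = 0
    for q, row in enumerate(similarity):
        # Gallery indices sorted from most to least similar.
        ranked = sorted(range(len(row)), key=lambda g: row[g], reverse=True)
        if any(gallery_ids[g] == query_ids[q] for g in ranked[:k]):
            hits += 1
    return 100.0 * hits / len(similarity)

# Toy example: 2 text queries, 3 gallery images (illustrative values only).
similarity = [
    [0.9, 0.1, 0.3],  # query 0 ranks image 0 first
    [0.2, 0.4, 0.8],  # query 1 ranks image 2 first, image 1 second
]
query_ids = [7, 8]
gallery_ids = [7, 8, 9]

rank1 = recall_at_k(similarity, query_ids, gallery_ids, k=1)
rank2 = recall_at_k(similarity, query_ids, gallery_ids, k=2)
```

Here query 0 retrieves its identity at rank 1 and query 1 only at rank 2, so `rank1` is 50.0 and `rank2` is 100.0.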
