Dynamic Uncertainty Learning with Noisy Correspondence for Text-Based Person Search

About

Text-to-image person search aims to identify an individual based on a text description. To reduce data collection costs, large-scale text-image datasets are created from co-occurrence pairs found online. However, this can introduce noise, particularly mismatched pairs, which degrade retrieval performance. Existing methods often focus on negative samples, which amplifies this noise. To address these issues, we propose the Dynamic Uncertainty and Relational Alignment (DURA) framework, which includes the Key Feature Selector (KFS) and a new loss function, Dynamic Softmax Hinge Loss (DSH-Loss). KFS captures and models noise uncertainty, improving retrieval reliability. The bidirectional evidence from cross-modal similarity is modeled as a Dirichlet distribution, enhancing adaptability to noisy data. DSH-Loss adjusts the difficulty of negative samples to improve robustness in noisy environments. Our experiments on three datasets (CUHK-PEDES, ICFG-PEDES, and RSTPReid) show that the method offers strong noise resistance and improves retrieval performance in both low- and high-noise scenarios.
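
The abstract does not give the exact formulation, so the following is only a minimal sketch of how similarity-derived evidence is commonly turned into a Dirichlet-based uncertainty score in evidential learning. It is not the authors' implementation. The function names, the exponential evidence mapping, the temperature `tau`, and the averaging of the two retrieval directions are all assumptions made for illustration.

```python
import math

def evidence_from_similarity(sims, tau=0.1):
    # Assumed mapping: exponentiate scaled similarities to get non-negative evidence.
    return [math.exp(s / tau) for s in sims]

def dirichlet_uncertainty(evidence):
    # Standard subjective-logic reading of a Dirichlet with alpha_k = e_k + 1:
    # belief_k = e_k / S and uncertainty = K / S, where S = sum(alpha).
    k = len(evidence)
    strength = sum(e + 1.0 for e in evidence)
    belief = [e / strength for e in evidence]
    return belief, k / strength

def pair_uncertainty(sim, i, tau=0.1):
    # Bidirectional evidence: row i is image-to-text, column i is text-to-image.
    # Averaging the two directions is an illustrative choice.
    row = sim[i]
    col = [r[i] for r in sim]
    _, u_i2t = dirichlet_uncertainty(evidence_from_similarity(row, tau))
    _, u_t2i = dirichlet_uncertainty(evidence_from_similarity(col, tau))
    return 0.5 * (u_i2t + u_t2i)

# Toy image-text similarity matrix: pair 0 is clearly matched,
# pair 1 has no dominant match and looks like a noisy correspondence.
sim = [
    [0.90, 0.10, 0.00],
    [0.20, 0.30, 0.25],
    [0.10, 0.00, 0.80],
]
u_clean = pair_uncertainty(sim, 0)
u_noisy = pair_uncertainty(sim, 1)
```

In this toy matrix the ambiguous pair receives a higher uncertainty than the clearly matched one, which is the kind of signal a framework could use to down-weight suspected mismatched pairs.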

Zequn Xie, Haoming Ji, Chengxuan Li, Lingwei Meng • 2025

Related benchmarks

Task                         Dataset               Result (R-1)  Rank
Text-to-image person search  CUHK-PEDES 20% noise  75.04         14
Text-to-image person search  ICFG-PEDES 20% noise  66.62         14
Text-to-image person search  RSTPReid 20% noise    65.05         14
Text-to-image person search  ICFG-PEDES 50% noise  64.08         14
Text-to-image person search  RSTPReid 50% noise    62.95         14
Text-to-image person search  CUHK-PEDES 50% noise  70.89         14
Text-to-image person search  CUHK-PEDES 0% noise   76.14         7
Text-to-image person search  ICFG-PEDES 0% noise   67.88         7
Text-to-image person search  RSTPReid 0% noise     66.15         7
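
To read the noise-resistance claim directly off the numbers listed here, the short sketch below computes how much Rank-1 drops relative to the 0% noise setting for each dataset. The values are copied from the benchmark rows. The dictionary layout and the helper name `drop_from_clean` are illustrative choices, not part of the original listing.

```python
# Rank-1 results by dataset and noise rate (percent), taken from the benchmark rows.
r1 = {
    "CUHK-PEDES": {0: 76.14, 20: 75.04, 50: 70.89},
    "ICFG-PEDES": {0: 67.88, 20: 66.62, 50: 64.08},
    "RSTPReid": {0: 66.15, 20: 65.05, 50: 62.95},
}

def drop_from_clean(scores, noise):
    # Absolute Rank-1 drop relative to the 0% noise setting.
    return round(scores[0] - scores[noise], 2)

drops = {
    name: {noise: drop_from_clean(scores, noise) for noise in (20, 50)}
    for name, scores in r1.items()
}

for name, by_noise in drops.items():
    print(name, by_noise)
```

On these figures the drop at 20% noise is between 1.10 and 1.26 points on every dataset, and the drop at 50% noise is between 3.20 and 5.25 points.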