CLIP-Count: Towards Text-Guided Zero-Shot Object Counting

About

Recent advances in visual-language models have shown remarkable zero-shot text-image matching ability that is transferable to downstream tasks such as object detection and segmentation. Adapting these models for object counting, however, remains a formidable challenge. In this study, we first investigate transferring vision-language models (VLMs) for class-agnostic object counting. Specifically, we propose CLIP-Count, the first end-to-end pipeline that estimates density maps for open-vocabulary objects with text guidance in a zero-shot manner. To align the text embedding with dense visual features, we introduce a patch-text contrastive loss that guides the model to learn informative patch-level visual representations for dense prediction. Moreover, we design a hierarchical patch-text interaction module to propagate semantic information across different resolution levels of visual features. Benefiting from the full exploitation of the rich image-text alignment knowledge of pretrained VLMs, our method effectively generates high-quality density maps for objects-of-interest. Extensive experiments on FSC-147, CARPK, and ShanghaiTech crowd counting datasets demonstrate state-of-the-art accuracy and generalizability of the proposed method. Code is available: https://github.com/songrise/CLIP-Count.
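The patch-text contrastive loss mentioned in the abstract can be illustrated with a generic InfoNCE-style objective. The sketch below is an assumption for illustration only, not the authors' exact formulation (see the linked repository for that): it treats patches flagged as containing the object as positives, all other patches as negatives, and pulls the positives toward the text embedding. The function names, the `is_positive` flags, and the temperature value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def patch_text_contrastive_loss(patch_embs, text_emb, is_positive, temperature=0.07):
    """InfoNCE-style loss over patches (illustrative assumption, not the paper's code).

    patch_embs  : list of patch embedding vectors
    text_emb    : text embedding vector of the same dimension
    is_positive : one bool per patch, True if the patch contains the object
    """
    logits = [cosine(p, text_emb) / temperature for p in patch_embs]
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    pos_mass = sum(e for e, flag in zip(exps, is_positive) if flag)
    if pos_mass == 0:
        raise ValueError("at least one positive patch is required")
    return -math.log(pos_mass / sum(exps))


# Toy example: one patch aligned with the text, one orthogonal to it.
text = [1.0, 0.0]
patches = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = patch_text_contrastive_loss(patches, text, [True, False])
loss_misaligned = patch_text_contrastive_loss(patches, text, [False, True])
```

In this toy case the loss is close to zero when the positive patch matches the text embedding and large when the flags are swapped, which is the behavior a contrastive objective of this kind is meant to reward.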

Ruixiang Jiang, Lingbo Liu, Changwen Chen • 2023

Related benchmarks

Task                 Dataset                        Result       Rank
Object Counting      FSC-147 (test)                 MAE 17.78    297
Crowd Counting       ShanghaiTech Part A (test)     MAE 192.6    227
Object Counting      FSC-147 (val)                  MAE 18.76    211
Crowd Counting       ShanghaiTech Part B (test)     MAE 45.7     191
Crowd Counting       ShanghaiTech Part B            MAE 45.7     160
Crowd Counting       ShanghaiTech Part A            MAE 192.6    138
Car Object Counting  CARPK (test)                   MAE 11.96    116
Counting             CARPK                          MAE 11.7     41
Object Counting      PASCAL VOC Count 2007 (test)   mRMSE 32.7   40
Crowd Counting       ShanghaiTech B 12 (test)       MAE 45.7     10
Showing 10 of 14 rows
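
Most results above are reported as MAE (mean absolute error) over per-image counts. For density-map methods such as the one described in the abstract, the predicted count of an image is conventionally the sum of its density map. The following is a minimal sketch of that evaluation under this assumption; the helper names and toy data are hypothetical and are not taken from the CLIP-Count code.

```python
def count_from_density(density_map):
    """Predicted object count: the sum of all cells in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(pred_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    if len(pred_counts) != len(true_counts) or not pred_counts:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(p - t) for p, t in zip(pred_counts, true_counts))
    return total / len(pred_counts)


# Toy data: two tiny 2x2 "density maps" and their ground-truth counts.
density_maps = [
    [[0.5, 0.5], [1.0, 0.0]],      # sums to 2.0
    [[0.25, 0.25], [0.25, 0.25]],  # sums to 1.0
]
true_counts = [3, 1]

pred_counts = [count_from_density(d) for d in density_maps]
mae = mean_absolute_error(pred_counts, true_counts)  # (|2 - 3| + |1 - 1|) / 2 = 0.5
```

Lower MAE is better: a value of 17.78 on FSC-147 (test) means predicted counts are off by about 17.78 objects per image on average.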
