
CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not

About

In this paper, we leverage CLIP for zero-shot sketch-based image retrieval (ZS-SBIR). We are largely inspired by recent advances in foundation models and the unparalleled generalisation ability they seem to offer, but for the first time tailor it to benefit the sketch community. We put forward novel designs on how best to achieve this synergy, for both the category setting and the fine-grained setting ("all"). At the very core of our solution is a prompt learning setup. First, we show that simply by factoring in sketch-specific prompts, we already obtain a category-level ZS-SBIR system that outperforms all prior art by a large margin (24.8%) - a strong testament to the value of studying the CLIP and ZS-SBIR synergy. Moving on to the fine-grained setup is, however, trickier, and requires a deeper dive into this synergy. For that, we come up with two specific designs to tackle the fine-grained matching nature of the problem: (i) an additional regularisation loss to ensure the relative separation between sketches and photos is uniform across categories, which is not the case for the gold-standard standalone triplet loss, and (ii) a clever patch shuffling technique to help establish instance-level structural correspondences between sketch-photo pairs. With these designs, we again observe significant performance gains in the region of 26.9% over the previous state of the art. The take-home message, if any, is that the proposed CLIP and prompt learning paradigm carries great promise in tackling other sketch-related tasks (not limited to ZS-SBIR) where data scarcity remains a great challenge. Project page: https://aneeshan95.github.io/Sketch_LVM/
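The abstract does not spell out how the patch shuffling in (ii) works, so the following is only a minimal sketch of one plausible reading: an image is cut into a grid of equal patches, and the same permutation is applied to a sketch and its paired photo so that their patches stay in correspondence. The function names (`shuffle_patches`, `random_permutation`), the grid size, and the nested-list image representation are all assumptions made for illustration, not the authors' implementation.

```python
import random


def shuffle_patches(image, grid, permutation):
    """Reorder the grid x grid patches of a square image (a list of rows).

    permutation[dst] = src means output patch `dst` is copied from input
    patch `src`, with patches numbered row by row. Hypothetical helper.
    """
    size = len(image)
    if size % grid != 0 or any(len(row) != size for row in image):
        raise ValueError("image must be square with side divisible by grid")
    if sorted(permutation) != list(range(grid * grid)):
        raise ValueError("permutation must cover every patch exactly once")
    p = size // grid  # side length of one patch
    out = [[None] * size for _ in range(size)]
    for dst, src in enumerate(permutation):
        src_row, src_col = divmod(src, grid)
        dst_row, dst_col = divmod(dst, grid)
        for i in range(p):
            for j in range(p):
                out[dst_row * p + i][dst_col * p + j] = (
                    image[src_row * p + i][src_col * p + j]
                )
    return out


def random_permutation(grid, seed=None):
    """Draw a random ordering of the grid * grid patch indices."""
    rng = random.Random(seed)
    perm = list(range(grid * grid))
    rng.shuffle(perm)
    return perm


# Toy 4x4 "sketch" and "photo" standing in for real images (assumed data).
sketch = [[r * 4 + c for c in range(4)] for r in range(4)]
photo = [[100 + v for v in row] for row in sketch]

# Applying one shared permutation keeps sketch and photo patches aligned.
perm = random_permutation(2, seed=0)
shuffled_sketch = shuffle_patches(sketch, 2, perm)
shuffled_photo = shuffle_patches(photo, 2, perm)
```

In a training setup along these lines, a sketch and photo shuffled with the same permutation could serve as a matching pair, while a photo shuffled with a different permutation could serve as a mismatch; how the paper actually uses the shuffled pairs in its loss is not stated in the abstract.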

Aneeshan Sain, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Subhadeep Koley, Tao Xiang, Yi-Zhe Song • 2023

Related benchmarks

Task | Dataset | Result | Rank
Zero-Shot Sketch-Based Image Retrieval | TU-Berlin | mAP@all 65.1 | 18
Zero-Shot Sketch-Based Image Retrieval | Sketchy | mAP@200 0.723 | 17
Sketch-based image retrieval | TU-Berlin | mAP 63.1 | 15
Sketch-based image retrieval | Sketchy | mAP@200 71.3 | 15
Sketch-based image retrieval | QuickDraw | mAP 20.2 | 15
Zero-Shot Sketch-Based Image Retrieval | QuickDraw | mAP@all 0.202 | 12
Cross-category Fine-Grained Zero-Shot Sketch-Based Image Retrieval | Sketchy | Top-1 Acc 28.68 | 9
