Where's Waldo: Diffusion Features for Personalized Segmentation and Retrieval

About

Personalized retrieval and segmentation aim to locate specific instances within a dataset based on an input image and a short description of the reference instance. While supervised methods are effective, they require extensive labeled data for training. Recently, self-supervised foundation models have been applied to these tasks, showing results comparable to supervised methods. However, these models have a significant flaw: they struggle to locate a desired instance when other instances of the same class are present. In this paper, we explore text-to-image diffusion models for these tasks. Specifically, we propose a novel approach called PDM (Personalized Features Diffusion Matching), which leverages intermediate features of pre-trained text-to-image models for personalization tasks without any additional training. PDM demonstrates superior performance on popular retrieval and segmentation benchmarks, outperforming even supervised methods. We also highlight notable shortcomings in current instance and segmentation datasets and propose new benchmarks for these tasks.

Dvir Samuel, Rami Ben-Ari, Matan Levy, Nir Darshan, Gal Chechik • 2024

Related benchmarks

Task                             Dataset               Metric      Result   Rank
Personalized Retrieval           ROxford (Medium)      mAP         91.2     13
Personalized Retrieval           ROxford (Hard)        mAP         80.3     13
Personalized Retrieval           RParis (Medium)       mAP         94       13
Personalized Retrieval           RParis (Hard)         mAP         86.8     13
Personalized Retrieval           PerMIR 1.0 (test)     mAP         0.73     9
Personalized Image Segmentation  PerSeg (test)         mIoU        97.4     8
Personalized Image Segmentation  PerMIS Image (test)   mIoU        49.7     8
Video label propagation          DAVIS 2017 (val)      J&F Score   78       7
Video label propagation          PerMIS Video          J&F Score   76.5     7
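
For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how mean average precision (mAP) and intersection-over-union (IoU, averaged into mIoU) are commonly computed. It is a generic illustration, not the paper's evaluation code: the function names are hypothetical, the average-precision variant assumes every relevant item appears somewhere in the ranked list, and individual benchmarks may apply their own protocol details that are not modeled here.

```python
def average_precision(ranked_relevance):
    """AP for one query.

    ranked_relevance: list of 0/1 flags in rank order (1 = relevant result).
    Assumption: all relevant items appear somewhere in the ranked list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_query_relevance):
    """mAP: the mean of per-query AP values, as a fraction in [0, 1]."""
    aps = [average_precision(flags) for flags in per_query_relevance]
    return sum(aps) / len(aps) if aps else 0.0


def iou(pred_mask, true_mask):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention chosen for this sketch: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(mask_pairs):
    """mIoU: the mean IoU over (predicted, ground-truth) mask pairs."""
    scores = [iou(pred, true) for pred, true in mask_pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

These functions return fractions in [0, 1]; multiply by 100 to express a score as a percentage.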