Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization

About

Unsupervised localization and segmentation are long-standing computer vision challenges that involve decomposing an image into semantically meaningful segments without any labeled data. These tasks are particularly interesting in an unsupervised setting due to the difficulty and cost of obtaining dense image annotations, but existing unsupervised approaches struggle with complex scenes containing multiple objects. Unlike existing methods, which are purely based on deep learning, we take inspiration from traditional spectral segmentation methods by reframing image decomposition as a graph partitioning problem. Specifically, we examine the eigenvectors of the Laplacian of a feature affinity matrix from self-supervised networks. We find that these eigenvectors already decompose an image into meaningful segments and can be readily used to localize objects in a scene. Furthermore, by clustering the features associated with these segments across a dataset, we can obtain well-delineated, nameable regions, i.e., semantic segmentations. Experiments on complex datasets (Pascal VOC, MS-COCO) demonstrate that our simple spectral method outperforms the state of the art in unsupervised localization and segmentation by a significant margin. Our method can also be readily used for a variety of complex image editing tasks, such as background removal and compositing.
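
The graph-partitioning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes patch features are already available as an array, replaces the self-supervised network with synthetic toy features, uses a clipped cosine similarity as the affinity, and uses the symmetric normalized Laplacian. The function name `spectral_bipartition` and all variable names are hypothetical.

```python
import numpy as np


def spectral_bipartition(features):
    """Split patches into two groups using the second-smallest
    eigenvector (Fiedler vector) of a feature-affinity graph Laplacian."""
    # L2-normalize rows so that dot products are cosine similarities.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    # Affinity matrix (assumption): cosine similarity, negatives clipped to 0.
    w = np.clip(f @ f.T, 0.0, None)
    np.fill_diagonal(w, 0.0)
    degree = w.sum(axis=1)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(lap)
    fiedler = eigvecs[:, 1]
    # The sign of the Fiedler vector gives a two-way partition of the patches.
    return fiedler > 0, eigvals


# Toy stand-in for patch features: two noisy groups of 6 and 10 "patches".
rng = np.random.default_rng(0)
centers = np.array([[1.0, 0.1, 0.0], [0.1, 1.0, 0.0]])
labels = np.array([0] * 6 + [1] * 10)
features = centers[labels] + 0.05 * rng.normal(size=(16, 3))

mask, eigvals = spectral_bipartition(features)
print(mask.astype(int))
```

Because the sign of an eigenvector is arbitrary, which group comes out as `True` is not meaningful on its own. A real pipeline would need an extra rule to decide which side is the object, and that rule is outside the scope of this sketch.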

Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU 37.2 | 1342 |
| Salient Object Detection | DUTS (test) | -- | 302 |
| Interactive Segmentation | Berkeley | NoC@90 7.75 | 230 |
| Interactive Segmentation | GrabCut | NoC@90 4.64 | 225 |
| Salient Object Detection | ECSSD | -- | 202 |
| Interactive Segmentation | DAVIS | NoC@90 10.11 | 197 |
| Interactive Segmentation | SBD | NoC @ 90% Target 11.57 | 171 |
| Semantic segmentation | COCO Stuff-27 (val) | mIoU 890 | 75 |
| Object Localization | PASCAL VOC 2012 (trainval) | CorLoc 66.4 | 51 |
| Salient Object Detection | ECSSD 1,000 images (test) | -- | 48 |
Showing 10 of 46 rows
