Cut and Learn for Unsupervised Object Detection and Instance Segmentation

About

We propose Cut-and-LEaRn (CutLER), a simple approach for training unsupervised object detection and segmentation models. We leverage the property of self-supervised models to 'discover' objects without supervision and amplify it to train a state-of-the-art localization model without any human labels. CutLER first uses our proposed MaskCut approach to generate coarse masks for multiple objects in an image and then learns a detector on these masks using our robust loss function. We further improve the performance by self-training the model on its predictions. Compared to prior work, CutLER is simpler, compatible with different detection architectures, and detects multiple objects. CutLER is also a zero-shot unsupervised detector and improves detection performance AP50 by over 2.7 times on 11 benchmarks across domains like video frames, paintings, sketches, etc. With finetuning, CutLER serves as a low-shot detector surpassing MoCo-v2 by 7.3% AP^box and 6.6% AP^mask on COCO when training with 5% labels.
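The abstract does not spell out how MaskCut turns self-supervised features into a coarse mask. The sketch below is one plausible reading, not the authors' implementation: it assumes a Normalized-Cuts style bipartition over patch features, where patches are linked strongly when their cosine similarity exceeds a threshold. The function name `bipartition`, the threshold `tau`, and the toy features are all hypothetical, and only a single split is shown rather than the multi-object procedure the abstract describes.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def bipartition(features, tau=0.15, iters=500):
    """Split patches into two groups (hypothetical helper, illustrative tau).

    Uses the spectral relaxation of Normalized Cuts: the second eigenvector
    of the normalised affinity matrix, found by deflated power iteration.
    Returns one boolean per patch; which group is True is arbitrary.
    """
    n = len(features)
    # Binarised affinity: strong edge above tau, near-zero edge otherwise.
    w = [[1.0 if cosine(features[i], features[j]) >= tau else 1e-5
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in w]
    # Symmetrically normalised affinity M = D^-1/2 W D^-1/2.
    m = [[w[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of M is D^1/2 * 1 (the trivial cut).
    total = math.sqrt(sum(d))
    top = [math.sqrt(di) / total for di in d]

    # Deterministic start; assumed not orthogonal to the target eigenvector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        # Power step on (M + I) so every eigenvalue is non-negative.
        v = [sum(m[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        # Deflate the trivial eigenvector, then renormalise.
        proj = sum(vi * ti for vi, ti in zip(v, top))
        v = [vi - proj * ti for vi, ti in zip(v, top)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm == 0.0:
            break
        v = [vi / norm for vi in v]

    # Map back to the generalised eigenvector and threshold at its mean.
    x = [vi / math.sqrt(di) for vi, di in zip(v, d)]
    mean = sum(x) / n
    return [xi > mean for xi in x]


# Toy input: three patches pointing one way, three pointing another.
toy = [[1.0, 0.0], [1.0, 0.05], [0.98, 0.02],
       [0.0, 1.0], [0.05, 1.0], [0.02, 0.98]]
print(bipartition(toy))
```

In a real pipeline the feature vectors would come from a self-supervised backbone, and the resulting coarse masks would serve as pseudo-labels for the detector and the later self-training rounds mentioned in the abstract.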

Xudong Wang, Rohit Girdhar, Stella X. Yu, Ishan Misra • 2023

Related benchmarks

Task                         Dataset                 Result            Rank
Semantic segmentation        ADE20K (val)            mIoU 35.7         2731
Object Detection             COCO 2017 (val)         --                2454
Instance Segmentation        COCO 2017 (val)         --                1144
Semantic segmentation        ADE20K                  mIoU 35.7         936
Semantic segmentation        Cityscapes              mIoU 18.7         578
Semantic segmentation        Cityscapes (val)        mIoU 18.7         572
Video Instance Segmentation  YouTube-VIS 2019 (val)  AP 16             567
Semantic segmentation        PASCAL VOC (val)        mIoU 53.8         338
Semantic segmentation        PASCAL Context (val)    mIoU 43.4         323
3D Instance Segmentation     ScanNet V2 (val)        Average AP50 0.2  195
Showing 10 of 63 rows
