SegGPT: Segmenting Everything In Context

About

We present SegGPT, a generalist model for segmenting everything in context. We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images. The training of SegGPT is formulated as an in-context coloring problem with random color mapping for each data sample. The objective is to accomplish diverse tasks according to the context, rather than relying on specific colors. After training, SegGPT can perform arbitrary segmentation tasks in images or videos via in-context inference, such as object instance, stuff, part, contour, and text. SegGPT is evaluated on a broad range of tasks, including few-shot semantic segmentation, video object segmentation, semantic segmentation, and panoptic segmentation. Our results show strong capabilities in segmenting in-domain and out-of-domain targets, either qualitatively or quantitatively.
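The random color mapping described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `sample_palette` and `colorize`, the toy masks, and the use of NumPy are assumptions made for illustration. The sketch draws a fresh random color for each label in a sample and paints the prompt and target masks with the same palette, so a color has no fixed meaning across samples.

```python
import numpy as np

def sample_palette(label_ids, rng):
    """Draw one random RGB color per label id (hypothetical helper)."""
    colors = rng.integers(0, 256, size=(len(label_ids), 3), dtype=np.uint8)
    return {int(label): colors[i] for i, label in enumerate(label_ids)}

def colorize(mask, palette):
    """Turn an integer label mask (H, W) into an RGB image (H, W, 3)."""
    image = np.zeros(mask.shape + (3,), dtype=np.uint8)
    for label, color in palette.items():
        # Boolean indexing selects every pixel carrying this label.
        image[mask == label] = color
    return image

rng = np.random.default_rng(0)

# Toy label masks standing in for a prompt and a target from the same sample.
prompt_mask = np.array([[0, 0, 1], [2, 2, 1]])
target_mask = np.array([[1, 0, 0], [1, 2, 2]])

# The palette is redrawn for every sample, so only the correspondence
# between prompt and target carries information, not the color itself.
labels = np.unique(np.concatenate([prompt_mask.ravel(), target_mask.ravel()]))
palette = sample_palette(labels, rng)

prompt_rgb = colorize(prompt_mask, palette)
target_rgb = colorize(target_mask, palette)
```

Two labels can occasionally draw the same color in this sketch; a real training pipeline would probably need to guard against such collisions.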

Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang • 2023

Related benchmarks

Task                           | Dataset            | Metric            | Result | Rank
Semantic segmentation          | ADE20K (val)       | mIoU              | 39.9   | 2731
Video Object Segmentation      | DAVIS 2017 (val)   | J mean            | 72.5   | 1130
Semantic segmentation          | ADE20K             | mIoU              | 39.6   | 936
Video Object Segmentation      | DAVIS 2016 (val)   | J Mean            | 83.6   | 564
Panoptic Segmentation          | COCO 2017 (val)    | PQ                | 34.4   | 172
Semantic segmentation          | PASCAL-5i          | Mean mIoU         | 89.8   | 111
Few-shot Semantic Segmentation | COCO-20i (test)    | --                | --     | 79
Semantic segmentation          | COCO 20^i (test)   | mIoU              | 67.9   | 48
Video Object Segmentation      | YouTube-VOS 2018   | Score G           | 74.7   | 47
Video Object Segmentation      | DAVIS 2017         | Jaccard Index (J) | 72.5   | 42
Showing 10 of 30 rows
