Taming SAM3 in the Wild: A Concept Bank for Open-Vocabulary Segmentation
About
The recent introduction of SAM3 has revolutionized Open-Vocabulary Segmentation (OVS) through promptable concept segmentation, which grounds pixel predictions in flexible concept prompts. However, this reliance on pre-defined concepts makes the model vulnerable: when visual distributions shift (data drift) or conditional label distributions evolve (concept drift) in the target domain, the alignment between visual evidence and prompts breaks down. In this work, we present ConceptBank, a parameter-free calibration framework to restore this alignment on the fly. Instead of adhering to static prompts, we construct a dataset-specific concept bank from the target statistics. Our approach (i) anchors target-domain evidence via class-wise visual prototypes, (ii) mines representative supports to suppress outliers under data drift, and (iii) fuses candidate concepts to rectify concept drift. We demonstrate that ConceptBank effectively adapts SAM3 to distribution drifts, including challenging natural-scene and remote-sensing scenarios, establishing a new baseline for robustness and efficiency in OVS. Code and model are available at https://github.com/pgsmall/ConceptBank.
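
The three steps named in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the abstract does not specify how prototypes are computed, how supports are selected, or how concepts are fused, so everything below is an assumption made for illustration. That includes the function names (`class_prototypes`, `mine_supports`, `fuse_concepts`), the use of cosine similarity, top-k selection, and softmax weighting with a `temperature` parameter. Features and concept embeddings are plain lists of floats standing in for whatever encoder outputs the real pipeline uses.

```python
import math
from collections import defaultdict


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def class_prototypes(features, labels):
    """(i) Assumed form: one prototype per class, the normalized mean feature."""
    groups = defaultdict(list)
    for f, y in zip(features, labels):
        groups[y].append(normalize(f))
    return {y: normalize(mean(fs)) for y, fs in groups.items()}


def mine_supports(features, prototype, k):
    """(ii) Assumed form: keep the k features closest to the prototype."""
    ranked = sorted(features, key=lambda f: cosine(f, prototype), reverse=True)
    return ranked[:k]


def fuse_concepts(candidates, prototype, temperature=0.1):
    """(iii) Assumed form: softmax-weighted average of candidate concept
    embeddings, weighted by their similarity to the visual prototype."""
    embs = list(candidates.values())
    logits = [cosine(c, prototype) / temperature for c in embs]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in logits]
    total = sum(weights)
    fused = [0.0] * len(prototype)
    for w, c in zip(weights, embs):
        for i, x in enumerate(normalize(c)):
            fused[i] += (w / total) * x
    return normalize(fused)


# Toy usage: three "road" features, one of which is an outlier.
road_feats = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
protos = class_prototypes(road_feats, ["road", "road", "road"])
supports = mine_supports(road_feats, protos["road"], k=2)  # outlier dropped
refined = normalize(mean([normalize(f) for f in supports]))
concept = fuse_concepts({"road": [1.0, 0.0], "grass": [0.0, 1.0]}, refined)
```

In this toy setup the fused concept ends up dominated by the candidate closest to the refined prototype, which is the intuition behind calibrating prompts against target-domain evidence rather than keeping them static.
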
Related benchmarks
| Task | Dataset | mIoU | Rank |
|---|---|---|---|
| Open Vocabulary Semantic Segmentation | COCOStuff (val) | 46.4 | 60 |
| Open Vocabulary Semantic Segmentation | Cityscapes (val) | 75.1 | 37 |
| Open Vocabulary Semantic Segmentation | PASCAL Context 59 (val) | 63 | 32 |
| Open-Vocabulary Segmentation | Pascal VOC 21 2012 (val) | 87.1 | 27 |
| Open-Vocabulary Segmentation | Pascal Context 60 (val) | 56.5 | 26 |
| Open-Vocabulary Segmentation | COCO-Object (COCO-O) (val) | 67.9 | 25 |
| Open-Vocabulary Segmentation | ADE20K (ADE) (val) | 43.3 | 25 |
| Open-Vocabulary Segmentation | Pascal VOC 20 2012 (val) | 97.4 | 23 |
| Open-Vocabulary Segmentation | Vaihingen | 63 | 21 |
| Open-Vocabulary Segmentation | LoveDA | 49.4 | 21 |
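
For readers unfamiliar with the metric in the table, the following is a minimal sketch of mean Intersection-over-Union (mIoU) on flattened label maps. It is an illustration under stated assumptions, not the evaluation code behind the numbers above: the function name `mean_iou`, the `ignore_index` value of 255, and the choice to average only over classes that appear in the prediction or the ground truth are all assumptions. Benchmark protocols typically accumulate intersections and unions over the whole dataset before averaging, and usually report the result as a percentage.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class ids of equal length.
    Pixels whose ground-truth label equals ignore_index are skipped.
    Predictions are assumed to be valid class ids in [0, num_classes).
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A mismatched pixel counts toward the union of both classes.
            union[t] += 1
            union[p] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy usage: six pixels, three classes, the last pixel ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/2, 2/3, and 1, so `score` is their mean, 13/18.
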