BALD-SAM: Disagreement-based Active Prompting in Interactive Segmentation

About

The Segment Anything Model (SAM) has revolutionized interactive segmentation through spatial prompting. While existing work primarily focuses on automating prompts in various settings, real-world annotation workflows involve iterative refinement where annotators observe model outputs and strategically place prompts to resolve ambiguities. Current pipelines typically rely on the annotator's visual assessment of the predicted mask quality. We postulate that a principled approach for automated interactive prompting is to use a model-derived criterion to identify the most informative region for the next prompt. In this work, we establish active prompting: a spatial active learning approach where locations within images constitute an unlabeled pool and prompts serve as queries to prioritize information-rich regions, increasing the utility of each interaction. We further present BALD-SAM: a principled framework adapting Bayesian Active Learning by Disagreement (BALD) to spatial prompt selection by quantifying epistemic uncertainty. To do so, we freeze the entire model and apply Bayesian uncertainty modeling only to a small learned prediction head, making intractable uncertainty estimation practical for large multi-million parameter foundation models. Across 16 datasets spanning natural, medical, underwater, and seismic domains, BALD-SAM demonstrates strong cross-domain performance, ranking first or second on 14 of 16 benchmarks. We validate these gains through a comprehensive ablation suite covering 3 SAM backbones and 35 Laplace posterior configurations, amounting to 38 distinct ablation settings. Beyond strong average performance, BALD-SAM surpasses human prompting and, in several categories, even oracle prompting, while consistently outperforming one-shot baselines in final segmentation quality, particularly on thin and structurally complex objects.

Prithwijit Chowdhury, Mohit Prabhushankar, Ghassan AlRegib • 2026
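
The disagreement criterion named in the abstract can be illustrated with a minimal sketch. It assumes the standard BALD score (entropy of the mean prediction minus the mean per-sample entropy) computed per pixel from a stack of foreground probability maps, and it takes those maps as given rather than drawing them from a Laplace posterior over a prediction head. The function names (`binary_entropy`, `bald_map`, `next_prompt_location`) and the array layout are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with foreground probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_map(prob_samples):
    """Per-pixel BALD score.

    prob_samples: array of shape (S, H, W) holding foreground probabilities
    from S posterior samples (assumed to be provided by the caller).
    """
    mean_p = prob_samples.mean(axis=0)
    predictive_entropy = binary_entropy(mean_p)                   # total uncertainty
    expected_entropy = binary_entropy(prob_samples).mean(axis=0)  # aleatoric part
    return predictive_entropy - expected_entropy                  # epistemic part


def next_prompt_location(prob_samples):
    """Return the (row, col) of the pixel where the samples disagree most."""
    scores = bald_map(prob_samples)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col)
```

In this sketch a pixel where every sample predicts 0.5 scores zero, because the samples agree even though each one is uncertain. A pixel where some samples predict foreground and others predict background scores high, which is the kind of location a disagreement-based criterion would prioritize for the next prompt.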

Related benchmarks

Task                               Dataset        Result                        Rank
Interactive Segmentation           Dolphin        mIoU 0.831                    16
Active Prompting for Segmentation  Dolphin        Peak Normalized Δ IoU 90.13   10
Interactive Segmentation           Bird           mIoU 79.5                     8
Interactive Segmentation           BUS            mIoU 85.5                     8
Interactive Segmentation           Tie            mIoU 73.9                     8
Interactive Segmentation           Stop sign      mIoU 89.9                     8
Interactive Segmentation           Polyp          mIoU 81                       8
Interactive Segmentation           Skin           mIoU 69.3                     8
Interactive Segmentation           Baseball bat   mIoU 0.743                    8
Interactive Segmentation           Cat            mIoU 88.5                     8
Showing 10 of 29 rows
