BALD-SAM: Disagreement-based Active Prompting in Interactive Segmentation

About

The Segment Anything Model (SAM) has revolutionized interactive segmentation through spatial prompting. While existing work primarily focuses on automating prompts in various settings, real-world annotation workflows involve iterative refinement where annotators observe model outputs and strategically place prompts to resolve ambiguities. Current pipelines typically rely on the annotator's visual assessment of the predicted mask quality. We postulate that a principled approach for automated interactive prompting is to use a model-derived criterion to identify the most informative region for the next prompt. In this work, we establish active prompting: a spatial active learning approach where locations within images constitute an unlabeled pool and prompts serve as queries to prioritize information-rich regions, increasing the utility of each interaction. We further present BALD-SAM: a principled framework adapting Bayesian Active Learning by Disagreement (BALD) to spatial prompt selection by quantifying epistemic uncertainty. To do so, we freeze the entire model and apply Bayesian uncertainty modeling only to a small learned prediction head, making intractable uncertainty estimation practical for large multi-million parameter foundation models. Across 16 datasets spanning natural, medical, underwater, and seismic domains, BALD-SAM demonstrates strong cross-domain performance, ranking first or second on 14 of 16 benchmarks. We validate these gains through a comprehensive ablation suite covering 3 SAM backbones and 35 Laplace posterior configurations, amounting to 38 distinct ablation settings. Beyond strong average performance, BALD-SAM surpasses human prompting and, in several categories, even oracle prompting, while consistently outperforming one-shot baselines in final segmentation quality, particularly on thin and structurally complex objects.
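The BALD criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that S per-pixel foreground probability maps have already been obtained by sampling the prediction head's posterior (the sampling step is not shown), and it scores each location by the mutual information between the prediction and the parameters: the entropy of the mean prediction minus the mean entropy of the individual predictions. The function names `bald_map` and `next_prompt_location` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Elementwise entropy (in nats) of Bernoulli probabilities."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_map(probs):
    """BALD score per pixel.

    probs: array of shape (S, H, W) holding foreground probabilities
    from S posterior samples (assumed to be given).
    """
    predictive_entropy = binary_entropy(probs.mean(axis=0))  # H[E_theta p]
    expected_entropy = binary_entropy(probs).mean(axis=0)    # E_theta H[p]
    return predictive_entropy - expected_entropy


def next_prompt_location(probs):
    """Return the (row, col) with the highest BALD score."""
    scores = bald_map(probs)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col)


# Toy example: 4 samples over a 3x3 image.
probs = np.full((4, 3, 3), 0.9)  # samples agree confidently almost everywhere
probs[:, 0, 0] = 0.5             # all samples agree the pixel is ambiguous
probs[:2, 1, 2] = 0.05           # half the samples say background ...
probs[2:, 1, 2] = 0.95           # ... and half say foreground
print(next_prompt_location(probs))
```

In the toy example, pixel (0, 0) has high predictive entropy but no disagreement between samples, so its BALD score is near zero; pixel (1, 2) is where the samples disagree, so it is the location selected for the next prompt. This is the distinction between epistemic uncertainty and plain ambiguity that the disagreement-based criterion is meant to capture.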

Prithwijit Chowdhury, Mohit Prabhushankar, Ghassan AlRegib • 2026

Related benchmarks

Task	Dataset	Metric	Result
Interactive Segmentation	Dolphin	mIoU	0.831	16
Active Prompting for Segmentation	Dolphin	Peak Normalized Δ IoU	90.13	10
Interactive Segmentation	Bird	mIoU	79.5	8
Interactive Segmentation	BUS	mIoU	85.5	8
Interactive Segmentation	Tie	mIoU	73.9	8
Interactive Segmentation	Stop sign	mIoU	89.9	8
Interactive Segmentation	Polyp	mIoU	81	8
Interactive Segmentation	Skin	mIoU	69.3	8
Interactive Segmentation	Baseball bat	mIoU	0.743	8
Interactive Segmentation	Cat	mIoU	88.5	8

Showing 10 of 29 rows
