
VL-SAM-v3: Memory-Guided Visual Priors for Open-World Object Detection

About

Open-world object detection aims to localize and recognize objects beyond a fixed closed-set label space. It is commonly divided into two settings: open-vocabulary detection, which assumes a predefined category list at test time, and open-ended detection, which requires generating candidate categories during inference. Existing methods rely primarily on coarse textual semantics and parametric knowledge, which often provide insufficient visual evidence for fine-grained appearance variation, rare categories, and cluttered scenes. In this paper, we propose VL-SAM-v3, a unified framework that augments open-world detection with retrieval-grounded external visual memory. Specifically, once candidate categories are available, VL-SAM-v3 retrieves relevant visual prototypes from a non-parametric memory bank and transforms them into two complementary visual priors: sparse priors for instance-level spatial anchoring and dense priors for class-aware local context. These priors are integrated with the original detection prompts via Memory-Guided Prompt Refinement, enabling a shared retrieval-and-refinement mechanism that supports both open-vocabulary and open-ended inference. Extensive zero-shot experiments on LVIS show that VL-SAM-v3 consistently improves detection performance under both open-vocabulary and open-ended inference, with particularly strong gains on rare categories. Moreover, experiments with a stronger open-vocabulary detector (i.e., SAM3) validate the generality of the proposed retrieval-and-refinement mechanism.
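To make the retrieval-and-refinement idea in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the memory bank is a matrix of prototype embeddings, that retrieval uses cosine similarity, and that refinement is a simple linear blend of the prompt with the mean retrieved prototype. The abstract does not specify any of these choices, and the sketch does not model the sparse/dense prior split. The names `retrieve_prototypes`, `refine_prompt`, `k`, and `alpha` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps guards against zero norm)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def retrieve_prototypes(query, bank_feats, k=2):
    """Return indices and cosine similarities of the k bank entries closest to `query`.

    Assumption: the memory bank is a (num_entries, dim) array of embeddings.
    """
    sims = l2_normalize(bank_feats) @ l2_normalize(query)
    top = np.argsort(-sims)[:k]  # highest similarity first
    return top, sims[top]


def refine_prompt(prompt, prototypes, alpha=0.5):
    """Hypothetical refinement: blend the prompt with the mean retrieved prototype."""
    prior = prototypes.mean(axis=0)
    return l2_normalize((1.0 - alpha) * prompt + alpha * prior)


# Toy memory bank: four 3-D prototype embeddings (illustrative values only).
bank = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
prompt = np.array([1.0, 0.0, 0.0])  # stand-in for a category prompt embedding

top_idx, top_sims = retrieve_prototypes(prompt, bank, k=2)
refined = refine_prompt(prompt, bank[top_idx], alpha=0.5)
```

Because the bank is queried at inference time rather than learned into model weights, entries can be added or removed without retraining, which is the sense in which such a memory is non-parametric.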

Chih-Chung Liu, Zhiwei Lin, Yongtao Wang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Detection | LVIS (val) | mAP 54.1 | 170 |
| Object Detection | LVIS (minival) | AP 51.7 | 159 |
| Object Detection | LVIS mini (val) | mAP 60.2 | 120 |
| Object Detection | COCO | AP 56.8 | 21 |
| Open-ended instance segmentation | LVIS mini (val) | AP (Mask) 39.9 | 3 |