VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

About

Salient object detection (SOD) and camouflaged object detection (COD) are related yet distinct binary mapping tasks. These tasks involve multiple modalities, sharing commonalities and unique cues. Existing research often employs intricate task-specific specialist models, potentially leading to redundancy and suboptimal results. We introduce VSCode, a generalist model with novel 2D prompt learning, to jointly address four SOD tasks and three COD tasks. We utilize VST as the foundation model and introduce 2D prompts within the encoder-decoder architecture to learn domain and task-specific knowledge on two separate dimensions. A prompt discrimination loss helps disentangle peculiarities to benefit model optimization. VSCode outperforms state-of-the-art methods across six tasks on 26 datasets and exhibits zero-shot generalization to unseen tasks, such as RGB-D COD, by combining 2D prompts. Source code is available at https://github.com/Sssssuperior/VSCode.
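The idea of keeping domain and task knowledge on two separate prompt dimensions can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the names `domain_prompts`, `task_prompts`, and `combine_prompts`, the prompt length, the random values, and the use of plain concatenation. The abstract does not specify how VSCode shapes or injects its prompts, so this is not the authors' implementation.

```python
import random

PROMPT_LEN = 4  # hypothetical prompt length, chosen only for this sketch


def make_prompt(seed, length=PROMPT_LEN):
    """Return a toy prompt vector; stands in for a learned prompt."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


# One prompt per value on each axis: domain (modality) and task.
domain_prompts = {"RGB": make_prompt(0), "RGB-D": make_prompt(1)}
task_prompts = {"SOD": make_prompt(2), "COD": make_prompt(3)}


def combine_prompts(domain, task):
    """Pair one domain prompt with one task prompt (concatenation is an assumption)."""
    return domain_prompts[domain] + task_prompts[task]


# Any (domain, task) pair can be composed from the stored prompts,
# including a pairing such as RGB-D COD.
combined = combine_prompts("RGB-D", "COD")
all_pairs = [(d, t) for d in domain_prompts for t in task_prompts]
```

With two prompts on each axis, four stored prompts cover all four (domain, task) pairings in `all_pairs`. That compositional property is what the abstract appeals to when it describes zero-shot generalization to RGB-D COD.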

Ziyang Luo, Nian Liu, Wangbo Zhao, Xuguang Yang, Dingwen Zhang, Deng-Ping Fan, Fahad Khan, Junwei Han • 2023

Related benchmarks

Task                              Dataset           Result                  Rank
RGB-D Salient Object Detection    STERE             S-measure (Sα) 0.931    198
Salient Object Detection          PASCAL-S          --                      186
RGB-D Salient Object Detection    SIP               S-measure (Sα) 0.924    124
Camouflaged Object Detection      COD10K            S-measure (Sα) 0.882    83
RGB-D Salient Object Detection    NLPR (test)       S-measure (Sα) 94.1     71
RGB-D Saliency Detection          NLPR              Max F-beta 0.932        65
Camouflaged Object Detection      CAMO 250 (test)   M (Mean Score) 0.046    59
RGB-D Salient Object Detection    NJUD              S-measure 94.4          54
Salient Object Detection          VT5000            S-Measure 0.925         50
Concealed Object Detection        NC4K              --                      46
