
Integrating Extra Modality Helps Segmentor Find Camouflaged Objects Well

About

Camouflaged Object Segmentation (COS) remains challenging because camouflaged objects exhibit only subtle visual differences from their backgrounds, and single-modality RGB methods provide limited cues, leading researchers to explore multimodal data to improve segmentation accuracy. In this work, we present MultiCOS, a novel framework that effectively leverages diverse data modalities to improve segmentation performance. MultiCOS comprises two modules: Bi-space Fusion Segmentor (BFSer), which employs a state-space and a latent-space fusion mechanism to integrate cross-modal features within a shared representation and uses a fusion-feedback mechanism to refine context-specific features, and Cross-modal Knowledge Learner (CKLer), which leverages external multimodal datasets to generate pseudo-modal inputs and establish cross-modal semantic associations, transferring knowledge to COS models when real multimodal pairs are missing. When real multimodal COS data are unavailable, CKLer yields additional segmentation gains using only non-COS multimodal sources. Experiments on standard COS benchmarks show that BFSer outperforms existing multimodal baselines with both real and pseudo-modal data. Code will be released at https://github.com/cnyvfang/MultiCOS.

Chengyu Fang, Chunming He, Longxiang Tang, Yuelin Zhang, Chenyang Zhu, Yuqi Shen, Chubin Chen, Guoxia Xu, Xiu Li • 2025

Related benchmarks

Task                         | Dataset       | Result                    | Rank
Camouflaged Object Detection | COD10K (test) | S-measure (S_alpha): 0.88 | 224
Camouflaged Object Detection | Chameleon     | S-measure (S_alpha): 92.3 | 150
Camouflaged Object Detection | CAMO (test)   | --                        | 111
Camouflaged Object Detection | NC4K          | M score: 0.031            | 67
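For reference, the "M score" reported on NC4K is conventionally the mean absolute error (MAE) between the predicted map and the binary ground-truth mask, averaged over pixels, where lower is better. A minimal sketch in plain Python (the toy prediction and mask values below are illustrative, not taken from the paper):

```python
def mae_score(pred, gt):
    """Mean absolute error ('M' score): average per-pixel absolute
    difference between a predicted segmentation map and the binary
    ground-truth mask, both with values in [0, 1]. Lower is better."""
    flat_pred = [v for row in pred for v in row]
    flat_gt = [v for row in gt for v in row]
    return sum(abs(p - g) for p, g in zip(flat_pred, flat_gt)) / len(flat_pred)

# Toy 2x2 example: a prediction close to the mask gives a small M.
pred = [[0.9, 0.1], [0.2, 0.95]]
gt = [[1.0, 0.0], [0.0, 1.0]]
print(mae_score(pred, gt))  # ~0.1125
```

The S-measure (S_alpha) used for COD10K and Chameleon is a structural-similarity metric that combines object-aware and region-aware similarity; it is not reproduced here because its definition is more involved.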
