
Decouple and Rectify: Semantics-Preserving Structural Enhancement for Open-Vocabulary Remote Sensing Segmentation

About

Open-vocabulary semantic segmentation in the remote sensing (RS) field requires both language-aligned recognition and fine-grained spatial delineation. Although CLIP offers robust semantic generalization, its globally aligned visual representations inherently struggle to capture structural details. Recent methods attempt to compensate for this by introducing RS-pretrained DINO features. However, these methods treat CLIP representations as a monolithic semantic space and cannot localize where structural enhancement is required, so they fail to delineate boundaries effectively and risk disrupting CLIP's semantic integrity. To address this limitation, we propose DR-Seg, a novel decouple-and-rectify framework. Our method is motivated by the key observation that CLIP feature channels exhibit distinct functional heterogeneity rather than forming a uniform semantic space. Building on this insight, DR-Seg decouples CLIP features into semantics-dominated and structure-dominated subspaces, enabling targeted structural enhancement by DINO without distorting language-aligned semantics. Subsequently, a prior-driven graph rectification module injects high-fidelity structural priors under DINO guidance to form a refined branch, while an uncertainty-guided adaptive fusion module dynamically integrates this refined branch with the original CLIP branch for the final prediction. Comprehensive experiments across eight benchmarks demonstrate that DR-Seg establishes a new state of the art.

Jie Feng, Fengze Li, Junpeng Zhang, Siyu Chen, Yuping Liang, Junying Chen, Ronghua Shang • 2026
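
The abstract mentions an uncertainty-guided adaptive fusion of a refined branch with the original CLIP branch but gives no formula. The sketch below is only one plausible reading, not the authors' method: it assumes uncertainty is measured by the normalized entropy of each branch's per-pixel class distribution, and that each branch is weighted by its confidence (one minus that entropy). The function names `softmax`, `normalized_entropy`, and `fuse_pixel` are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores for one pixel into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Entropy scaled to [0, 1]: 0 is fully confident, 1 is uniform."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def fuse_pixel(clip_logits, refined_logits, eps=1e-8):
    """ASSUMPTION: confidence-weighted average of the two branches.

    Each branch's weight is (1 - normalized entropy) of its own
    prediction, renormalized so the two weights sum to 1.
    """
    p_clip = softmax(clip_logits)
    p_ref = softmax(refined_logits)
    # Clamp at zero to absorb floating-point error near uniform inputs.
    w_clip = max(0.0, 1.0 - normalized_entropy(p_clip))
    w_ref = max(0.0, 1.0 - normalized_entropy(p_ref))
    total = w_clip + w_ref
    if total <= eps:
        # Both branches are maximally uncertain: fall back to equal weights.
        w_clip = w_ref = 0.5
    else:
        w_clip, w_ref = w_clip / total, w_ref / total
    return [w_clip * a + w_ref * b for a, b in zip(p_clip, p_ref)]


# A pixel where the CLIP branch is undecided and the refined branch is not.
fused = fuse_pixel([0.0, 0.0, 0.0], [5.0, 0.0, 0.0])
predicted_class = fused.index(max(fused))
```

Under this assumption, a branch that is close to uniform at a pixel contributes almost nothing there, so the more confident branch decides the label. A learned fusion module could behave quite differently.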

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | Vaihingen | mIoU 47.99 | 140 |
| Semantic segmentation | iSAID | mIoU 94.61 | 122 |
| Semantic segmentation | LoveDA | mIoU 34.1 | 92 |
| Semantic segmentation | VDD | mIoU 41 | 76 |
| Semantic segmentation | UAVid | mIoU 28.45 | 68 |
| Semantic segmentation | UDD5 | mIoU 46.63 | 63 |
| Semantic segmentation | Potsdam | mIoU 45.91 | 32 |
| Semantic segmentation | Mean over 8 Remote Sensing Datasets | Mean mIoU 49.01 | 32 |
| Semantic segmentation | DLRSD | mIoU 91.13 | 32 |