Bridging Structure and Appearance: Topological Features for Robust Self-Supervised Segmentation

About

Self-supervised semantic segmentation methods often fail when faced with appearance ambiguities. We argue that this is due to an over-reliance on unstable, appearance-based features such as shadows, glare, and local textures. We propose GASeg, a novel framework that bridges appearance and geometry by leveraging stable topological information. The core of our method is the Differentiable Box-Counting (DBC) module, which quantifies multi-scale topological statistics from two parallel streams: geometry-based features and appearance-based features. To force the model to learn these stable structural representations, we introduce Topological Augmentation (TopoAug), an adversarial strategy that simulates real-world ambiguities by applying morphological operators to the input images. A multi-objective loss, GALoss, then explicitly enforces cross-modal alignment between the geometry-based and appearance-based features. Extensive experiments demonstrate that GASeg achieves state-of-the-art performance on four benchmarks, including COCO-Stuff, Cityscapes, and PASCAL, validating our approach of bridging geometry and appearance via topological information.
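To make the box-counting idea concrete, here is a minimal sketch in plain Python. It is not the authors' DBC module: it assumes a 2D grid of values in [0, 1], uses the maximum value inside each box as a soft occupancy (a common relaxation that stays piecewise differentiable when written in an autodiff framework), and fits the slope of log N(s) against log(1/s). The function names `soft_box_count` and `box_counting_dimension`, and the choice of box sizes, are hypothetical.

```python
import math


def soft_box_count(grid, box):
    """Sum, over non-overlapping box x box cells, of the max value in each cell.

    For a binary grid this equals the classic number of occupied boxes.
    For values in [0, 1] it acts as a soft (max-pooled) occupancy count.
    """
    h, w = len(grid), len(grid[0])
    total = 0.0
    for r in range(0, h, box):
        for c in range(0, w, box):
            total += max(
                grid[i][j]
                for i in range(r, min(r + box, h))
                for j in range(c, min(c + box, w))
            )
    return total


def box_counting_dimension(grid, boxes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) versus log(1/s) over the given box sizes.

    Assumes the grid is not empty of signal, so that every N(s) > 0.
    """
    xs = [math.log(1.0 / s) for s in boxes]
    ys = [math.log(soft_box_count(grid, s)) for s in boxes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


n = 16
filled = [[1.0] * n for _ in range(n)]                                  # solid square
line = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]   # diagonal line

dim_filled = box_counting_dimension(filled)  # approximately 2.0
dim_line = box_counting_dimension(line)      # approximately 1.0
```

The two toy inputs show why such a statistic can describe structure independently of appearance: a filled region scales like a 2D object and a thin line like a 1D one, regardless of the intensity pattern that produced the mask.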

Haotang Li, Zhenyu Qi, Hao Qin, Huanrui Yang, Sen He, Kebin Peng • 2025

Related benchmarks

Task	Dataset	Result
Semantic segmentation	Cityscapes	mIoU 23.2	578
Semantic segmentation	COCO Stuff	mIoU 30.1	195
Semantic segmentation	PASCAL VOC 2012	mIoU 55.9	187
Semantic segmentation	Potsdam	Accuracy 85.3	13
