
MixerCSeg: An Efficient Mixer Architecture for Crack Segmentation via Decoupled Mamba Attention

About

Feature encoders play a key role in pixel-level crack segmentation by shaping the representation of fine textures and thin structures. Existing CNN-, Transformer-, and Mamba-based models each capture only part of the required spatial or structural information, leaving clear gaps in modeling complex crack patterns. To address this, we present MixerCSeg, a mixer architecture designed like a coordinated team of specialists, where CNN-like pathways focus on local textures, Transformer-style paths capture global dependencies, and Mamba-inspired flows model sequential context within a single encoder. At the core of MixerCSeg is the TransMixer, which explores Mamba's latent attention behavior while establishing dedicated pathways that naturally express both locality and global awareness. To further enhance structural fidelity, we introduce a spatial block processing strategy and a Direction-guided Edge Gated Convolution (DEGConv) that strengthens edge sensitivity under irregular crack geometries with minimal computational overhead. A Spatial Refinement Multi-Level Fusion (SRF) module is then employed to refine multi-scale details without increasing complexity. Extensive experiments on multiple crack segmentation benchmarks show that MixerCSeg achieves state-of-the-art performance with only 2.05 GFLOPs and 2.54 M parameters, demonstrating both efficiency and strong representational capability. The code is available at https://github.com/spiderforest/MixerCSeg.
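The three-pathway design described above can be sketched in miniature. The toy below is an illustrative assumption, not the authors' implementation: on a 1-D signal, a CNN-like pathway applies a local 3-tap convolution, a Transformer-style pathway applies softmax self-attention over scalar tokens, and a Mamba-inspired pathway runs a linear state-space recurrence; the mixer fuses them by summation. All function names, the fusion rule, and the 1-D setting are hypothetical simplifications of the single-encoder mixer idea.

```python
import math

def local_pathway(x, kernel=(0.25, 0.5, 0.25)):
    """CNN-like: 3-tap convolution capturing local texture (edge-replicate padding)."""
    n = len(x)
    pad = [x[0]] + list(x) + [x[-1]]
    return [sum(k * pad[i + j] for j, k in enumerate(kernel)) for i in range(n)]

def global_pathway(x):
    """Transformer-style: softmax self-attention with scalar queries/keys/values."""
    out = []
    for q in x:
        scores = [q * k for k in x]            # dot-product attention scores
        m = max(scores)
        w = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(w)
        out.append(sum(wi * v for wi, v in zip(w, x)) / z)
    return out

def sequential_pathway(x, decay=0.9):
    """Mamba-inspired: linear recurrence h_t = a * h_{t-1} + (1 - a) * x_t."""
    h, out = 0.0, []
    for v in x:
        h = decay * h + (1.0 - decay) * v
        out.append(h)
    return out

def mixer_block(x):
    """Fuse the local, global, and sequential pathways by summation."""
    return [l + g + s for l, g, s in
            zip(local_pathway(x), global_pathway(x), sequential_pathway(x))]

signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # a toy step-edge "crack" profile
mixed = mixer_block(signal)
print(len(mixed))  # one fused feature per input position
```

In the full model, each pathway would operate on 2-D feature maps and the fusion would be learned (e.g., the SRF module refining multi-scale detail); the sketch only shows how three complementary context types can coexist in one block.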

Zilong Zhao, Zhengming Ding, Pei Niu, Wenhao Sun, Feng Guo• 2026

Related benchmarks

Task                 Dataset              Result        Rank
Crack Segmentation   CRACK500 (test)      mIoU 78.24    20
Crack Segmentation   DeepCrack (test)     mIoU 91.51    16
Crack Segmentation   CamCrack789 (test)   mIoU 84.09    16
Crack Segmentation   CrackMap (test)      mIoU 81.23    16
