Unite-Divide-Unite: Joint Boosting Trunk and Structure for High-accuracy Dichotomous Image Segmentation
About
High-accuracy Dichotomous Image Segmentation (DIS) aims to pinpoint category-agnostic foreground objects in natural scenes. The main challenge for DIS is identifying the dominant area with high accuracy while also rendering detailed object structure. However, directly using a general encoder-decoder architecture may result in an oversupply of high-level features and neglect the shallow spatial information necessary for partitioning meticulous structures. To fill this gap, we introduce a novel Unite-Divide-Unite Network (UDUN) that restructures and bipartitely arranges complementary features to simultaneously boost the effectiveness of trunk and structure identification. The proposed UDUN has several strengths. First, a dual-size input feeds into the shared backbone to produce more holistic and detailed features while keeping the model lightweight. Second, a simple Divide-and-Conquer Module (DCM) is proposed to decouple multiscale low- and high-level features into our structure decoder and trunk decoder to obtain structure and trunk information, respectively. Moreover, we design a Trunk-Structure Aggregation module (TSA) in our union decoder that performs cascade integration for uniform high-accuracy segmentation. As a result, UDUN performs favorably against state-of-the-art competitors in all six evaluation metrics on overall DIS-TE, achieving a weighted F-measure of 0.772 and an HCE of 977. Using 1024×1024 input, our model enables real-time inference at 65.3 fps with ResNet-18.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dichotomous Image Segmentation | DIS5K (DIS-VD) | S_alpha | 0.838 | 30 |
| Dichotomous Image Segmentation | DIS5K TE (1-4) (test) | Fw_beta | 83.1 | 25 |
| Dichotomous Image Segmentation | DIS5K (val) | Fw_beta | 0.823 | 18 |
| Dichotomous Image Segmentation | DIS-TE2 500 (test) | Fmax | 82.9 | 14 |
| Dichotomous Image Segmentation | DIS-TE3 500 (test) | Fmax | 86.5 | 14 |
| Dichotomous Image Segmentation | DIS TE4 500 (test) | Fmax | 84.6 | 14 |
| Dichotomous Image Segmentation | DIS 470 (val) | Fmax | 0.823 | 14 |
| Dichotomous Image Segmentation | DIS TE1 500 (test) | Fmax | 78.4 | 14 |
| Dichotomous Image Segmentation | DIS ALL 2,000 (test) | Fmax | 83.1 | 14 |
| Dichotomous Image Segmentation | DIS5K DIS-TE1 (test) | Fmax | 78.4 | 12 |
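
For readers unfamiliar with the Fmax entries in the table above, the following is a minimal sketch of how a maximal F-measure is commonly computed for a single predicted probability map against a binary ground-truth mask. It is not the authors' evaluation code. The function names `f_measure` and `max_f_measure` are hypothetical, and the value β² = 0.3 and the 256-step threshold sweep are assumptions based on common practice in segmentation benchmarks. The weighted F-measure (Fw_beta) uses a different, spatially weighted formulation that this sketch does not cover.

```python
from typing import Sequence

Mask = Sequence[Sequence[float]]


def f_measure(pred: Mask, gt: Mask, threshold: float, beta_sq: float = 0.3) -> float:
    """F-measure of a probability map binarized at `threshold`.

    beta_sq = 0.3 is an assumed convention, not a value taken from the paper.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            is_fg = p >= threshold  # binarize the prediction
            if is_fg and g:
                tp += 1
            elif is_fg and not g:
                fp += 1
            elif not is_fg and g:
                fn += 1
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def max_f_measure(pred: Mask, gt: Mask, steps: int = 256, beta_sq: float = 0.3) -> float:
    """Maximum F-measure over `steps` evenly spaced thresholds in [0, 1]."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    return max(f_measure(pred, gt, t, beta_sq) for t in thresholds)


if __name__ == "__main__":
    # Toy 2x2 example: two foreground pixels, one of them predicted weakly.
    prediction = [[0.9, 0.6], [0.4, 0.1]]
    ground_truth = [[1, 0], [1, 0]]
    print(f_measure(prediction, ground_truth, threshold=0.5))
    print(max_f_measure(prediction, ground_truth))
```

A benchmark score would typically aggregate this per-image value over the whole test split; the exact aggregation used for the numbers in the table is not stated here.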