OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers
About
We present OSFormer, the first one-stage transformer framework for camouflaged instance segmentation (CIS). OSFormer is based on two key designs. First, a location-sensing transformer (LST) obtains the location label and instance-aware parameters by introducing location-guided queries and a blend-convolution feed-forward network. Second, a coarse-to-fine fusion (CFF) module merges diverse context information from the LST encoder and the CNN backbone. Coupling these two components enables OSFormer to efficiently blend local features and long-range context dependencies when predicting camouflaged instances. Compared with two-stage frameworks, OSFormer reaches 41% AP and converges efficiently without requiring a large amount of training data: only 3,040 samples under 60 epochs. Code: https://github.com/PJLallen/OSFormer
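The idea of location-guided queries can be illustrated with a small sketch. The abstract does not say how the queries are built, so this example assumes (hypothetically) that each query is the average of the features inside one cell of an S x S grid, so that a query's index encodes a location. The function name `grid_queries` and the nested-list feature layout are illustrative choices, not part of the OSFormer code base.

```python
def grid_queries(feature_map, grid_size):
    """Average-pool an H x W x D feature map into grid_size**2 query vectors.

    Assumption: one query summarizes one grid cell. This is an illustrative
    stand-in, not the construction used in the OSFormer repository.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    depth = len(feature_map[0][0])
    queries = []
    for gy in range(grid_size):
        # Rows covered by this grid cell; always at least one row.
        y0 = gy * height // grid_size
        y1 = max((gy + 1) * height // grid_size, y0 + 1)
        for gx in range(grid_size):
            # Columns covered by this grid cell; always at least one column.
            x0 = gx * width // grid_size
            x1 = max((gx + 1) * width // grid_size, x0 + 1)
            cell = [feature_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # Channel-wise mean over all positions in the cell.
            queries.append([sum(v[d] for v in cell) / len(cell) for d in range(depth)])
    return queries


# Toy 4 x 4 map with one channel whose value is the flattened pixel index.
toy_map = [[[float(y * 4 + x)] for x in range(4)] for y in range(4)]
toy_queries = grid_queries(toy_map, 2)  # four queries, one per 2 x 2 cell
```

Queries are returned in row-major grid order, so query `k` corresponds to grid row `k // grid_size` and grid column `k % grid_size`. In a full model, a transformer decoder would refine such queries against encoder features; that step is omitted here.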
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Camouflaged Object Segmentation | CAMO (test) | S-measure (S_alpha) | 0.799 | 56 |
| Camouflaged Object Segmentation | NC4K | Fw_beta | 79 | 41 |
| Camouflaged Object Segmentation | Chameleon | Fw_beta | 83.6 | 28 |
| Instance Segmentation | COD10K v3 (test) | AP | 41 | 27 |
| Instance Segmentation | NC4K | AP | 42.5 | 27 |
| Instance Segmentation | COME15K-E | mAP | 53 | 23 |
| Instance Segmentation | COME15K-H | mAP | 45.8 | 23 |
| Instance Segmentation | DSIS | mAP | 67.9 | 23 |
| Instance Segmentation | SIP | mAP | 63.2 | 23 |
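The AP and mAP rows score instance segmentation quality. The page does not state the evaluation protocol; assuming the common mask-based AP, a predicted instance counts as a match when its mask overlaps a ground-truth mask above an intersection-over-union (IoU) threshold. The sketch below shows only that overlap measure; the name `mask_iou` and the nested-list mask format are illustrative assumptions.

```python
def mask_iou(pred, target):
    """IoU of two binary masks given as equal-sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks have no overlap to measure; return 0.0 by convention here.
    return intersection / union if union else 0.0


pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
overlap = mask_iou(pred_mask, true_mask)  # 2 shared pixels out of 4 covered
```

A full AP computation would additionally rank predictions by confidence and average precision over recall levels and IoU thresholds, which this sketch leaves out.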