End-to-End Instance Segmentation with Recurrent Attention
About
While convolutional neural networks have gained impressive success recently in solving structured prediction problems such as semantic segmentation, it remains a challenge to differentiate individual object instances in the scene. Instance segmentation is very important in a variety of applications, such as autonomous driving, image captioning, and visual question answering. Techniques that combine large graphical models with low-level vision have been proposed to address this problem; however, we propose an end-to-end recurrent neural network (RNN) architecture with an attention mechanism to model a human-like counting process, and produce detailed instance segmentations. The network is jointly trained to sequentially produce regions of interest as well as a dominant object segmentation within each region. The proposed model achieves competitive results on the CVPPP, KITTI, and Cityscapes datasets.
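The sequential process described in the abstract (propose a region, segment the dominant object inside it, repeat) can be illustrated with a small loop. This is only a toy sketch under stated assumptions, not the authors' model: the paper's components are learned networks, whereas `toy_propose_box` and `toy_segment_in_box` below are hypothetical hand-written stand-ins, and the `canvas` of already-explained pixels and the "stop when nothing is left" rule are illustrative choices rather than details given here.

```python
import numpy as np

def segment_instances(foreground, propose_box, segment_in_box, max_steps=20):
    """Emit one instance mask per step until the proposer finds nothing new."""
    canvas = np.zeros(foreground.shape, dtype=bool)  # pixels already explained
    instances = []
    for _ in range(max_steps):
        box = propose_box(foreground, canvas)
        if box is None:  # stand-in for a learned stopping decision
            break
        mask = segment_in_box(foreground, canvas, box)
        if not mask.any():
            break
        instances.append(mask)
        canvas |= mask  # remember what has been counted so far
    return instances

def toy_propose_box(foreground, canvas, half=2):
    """Hypothetical proposer: a fixed-size window around the first unexplained pixel."""
    remaining = np.argwhere(foreground & ~canvas)
    if remaining.size == 0:
        return None
    r, c = remaining[0]
    h, w = foreground.shape
    return max(r - half, 0), min(r + half + 1, h), max(c - half, 0), min(c + half + 1, w)

def toy_segment_in_box(foreground, canvas, box):
    """Hypothetical segmenter: unexplained foreground pixels inside the window."""
    r0, r1, c0, c1 = box
    mask = np.zeros(foreground.shape, dtype=bool)
    mask[r0:r1, c0:c1] = foreground[r0:r1, c0:c1] & ~canvas[r0:r1, c0:c1]
    return mask

# Two well-separated blobs on a small boolean image.
fg = np.zeros((6, 10), dtype=bool)
fg[1:3, 1:3] = True
fg[3:5, 6:9] = True
instances = segment_instances(fg, toy_propose_box, toy_segment_in_box)
print(len(instances), [int(m.sum()) for m in instances])
```

The loop structure is the point of the sketch: each step conditions on what earlier steps already covered, which is what lets a sequential model count objects one at a time.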
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Instance Segmentation | Cityscapes (test) | AP (Overall) | 9.5 | 122 |
| Panoptic Segmentation | Cityscapes (test) | -- | -- | 51 |
| Leaf Segmentation | CVPPA LSC (test) | BestDice A1 | 81.9 | 14 |
| Leaf Segmentation | CVPPP LSC (A1) | mSBD | 0.849 | 10 |
| Leaf Segmentation | CVPPP leaf segmentation challenge (test) | DiC | 0.8 | 8 |
| Instance Segmentation | CVPPP (test) | SBD | 0.849 | 6 |
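
For readers unfamiliar with the leaf segmentation metrics in the table, the sketch below shows how Symmetric Best Dice (SBD) and Difference in Count (DiC) are commonly computed from integer label maps. It is a minimal illustration, not the official evaluation script: the function names are made up for this example, label `0` is assumed to be background, and returning `0.0` when either side has no instances is a convention chosen here.

```python
import numpy as np

def dice(a, b):
    """Dice overlap between two boolean masks."""
    total = a.sum() + b.sum()
    if total == 0:
        return 0.0
    return 2.0 * np.logical_and(a, b).sum() / total

def instance_masks(label_map):
    """Split an integer label map (0 = background) into one boolean mask per instance."""
    return [label_map == k for k in np.unique(label_map) if k != 0]

def best_dice(masks_a, masks_b):
    """Mean, over masks in A, of the best Dice score against any mask in B."""
    if not masks_a or not masks_b:
        return 0.0  # convention assumed for this sketch
    return float(np.mean([max(dice(a, b) for b in masks_b) for a in masks_a]))

def symmetric_best_dice(pred, gt):
    """SBD: the smaller of the two directed Best Dice scores."""
    p, g = instance_masks(pred), instance_masks(gt)
    return min(best_dice(p, g), best_dice(g, p))

def difference_in_count(pred, gt):
    """DiC: predicted instance count minus ground-truth instance count."""
    return len(instance_masks(pred)) - len(instance_masks(gt))

gt = np.array([[1, 1, 0, 2, 2],
               [1, 1, 0, 2, 2]])
pred = np.array([[1, 1, 0, 0, 2],
                 [1, 1, 0, 0, 2]])
print(symmetric_best_dice(pred, gt), difference_in_count(pred, gt))
```

Leaderboards often report the absolute value of DiC averaged over images, so a reported figure may not be directly comparable to the signed per-image difference computed here.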