FEANet: Feature-Enhanced Attention Network for RGB-Thermal Real-time Semantic Segmentation
About
RGB-Thermal (RGB-T) information for semantic segmentation has been extensively explored in recent years. However, most existing RGB-T semantic segmentation methods compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. To better extract detailed spatial information, we propose a two-stage Feature-Enhanced Attention Network (FEANet) for the RGB-T semantic segmentation task. Specifically, we introduce a Feature-Enhanced Attention Module (FEAM) to excavate and enhance multi-level features from both the channel and spatial views. Benefiting from the proposed FEAM, FEANet preserves spatial information and shifts more attention to high-resolution features from the fused RGB-T images. Extensive experiments on the urban scene dataset demonstrate that FEANet outperforms other state-of-the-art (SOTA) RGB-T methods in terms of objective metrics and subjective visual comparison (+2.6% in global mAcc and +0.8% in global mIoU). For 480 × 640 RGB-T test images, FEANet runs at real-time speed on an NVIDIA GeForce RTX 2080 Ti card.
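The abstract reports gains in mAcc and mIoU without defining them. The sketch below shows one common way to compute both from a pixel-level confusion matrix. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `mean_acc_and_iou`, the toy two-class matrix, and the choice to skip classes absent from the ground truth are all hypothetical.

```python
def mean_acc_and_iou(conf):
    """Return (mAcc, mIoU) from a square confusion matrix.

    conf[i][j] = number of pixels whose ground-truth class is i
    and whose predicted class is j.
    Assumption: classes with no ground-truth pixels are skipped;
    other evaluation protocols may handle them differently.
    """
    n = len(conf)
    accs, ious = [], []
    for c in range(n):
        tp = conf[c][c]                           # correctly labeled pixels of class c
        gt = sum(conf[c])                         # all pixels whose true class is c
        pred = sum(conf[r][c] for r in range(n))  # all pixels predicted as c
        if gt == 0:
            continue                              # class absent from the ground truth
        accs.append(tp / gt)                      # per-class accuracy (recall)
        ious.append(tp / (gt + pred - tp))        # intersection over union
    if not accs:
        raise ValueError("confusion matrix contains no ground-truth pixels")
    return sum(accs) / len(accs), sum(ious) / len(ious)


# Toy two-class example (made-up counts, not data from the paper).
macc, miou = mean_acc_and_iou([[50, 10], [5, 35]])
print(f"mAcc={macc:.3f} mIoU={miou:.3f}")  # mAcc=0.854 mIoU=0.735
```

Both metrics average over classes rather than over pixels, so a rare class counts as much as a dominant one.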
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Semantic segmentation | MFNet (test) | mIoU | 55.3 | 134 |
| Semantic segmentation | PST900 (test) | mIoU | 85.5 | 72 |
| Semantic segmentation | FMB (test) | mIoU | 46.8 | 59 |
| Semantic segmentation | MFNet day-night (test) | Car IoU | 87.8 | 20 |
| Semantic segmentation | MFNet | mIoU | 55.3 | 13 |
| Semantic segmentation | MFNet RGB-T 2017 (test) | mIoU | 55.3 | 13 |
| Semantic segmentation | MF day-night 11 (evaluation set) | Unlabeled IoU | 98.3 | 12 |
| Scene Parsing | PST900 (test) | mIoU | 85.5 | 11 |
| RGB-T Semantic Segmentation | MFNet | Latency (ms) | 28.52 | 4 |
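
The latency in the last row of the table can be read as a frame rate with a one-line conversion. The sketch below assumes the 28.52 ms figure is the time to process a single image. The table does not state the batch size or the hardware used, so the result is only a rough indication. The helper name `fps_from_latency_ms` is hypothetical.

```python
def fps_from_latency_ms(latency_ms):
    """Convert per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# 28.52 ms is the latency listed for MFNet in the table.
fps = fps_from_latency_ms(28.52)
print(f"{fps:.1f} FPS")  # 35.1 FPS
```

Under that assumption, the listed latency corresponds to roughly 35 frames per second.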