AMVNet: Assertion-based Multi-View Fusion Network for LiDAR Semantic Segmentation
About
In this paper, we present an Assertion-based Multi-View Fusion network (AMVNet) for LiDAR semantic segmentation, which aggregates the semantic features of individual projection-based networks using late fusion. Given class scores from different projection-based networks, we perform assertion-guided point sampling on score disagreements and pass a set of point-level features for each sampled point to a simple point head, which refines the predictions. This modular-and-hierarchical late fusion approach provides the flexibility of having two independent networks with only minor overhead from a lightweight network. Such approaches are desirable for robotic systems, e.g. autonomous vehicles, for which computational and memory resources are often limited. Extensive experiments show that AMVNet achieves state-of-the-art results on both the SemanticKITTI and nuScenes benchmark datasets and that our approach outperforms the baseline method of combining the class scores of the projection-based networks.
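The late-fusion idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every detail is an assumption made for illustration: the function names are hypothetical, the "assertion" is taken to be a simple argmax disagreement between the two networks, the point-level features are just the two concatenated score vectors, the point head is a two-layer MLP with random untrained weights, and the baseline is a plain sum of class scores.

```python
import numpy as np

def disagreement_mask(scores_a, scores_b):
    """Assumed assertion: True where the two networks predict different classes."""
    return scores_a.argmax(axis=1) != scores_b.argmax(axis=1)

def sample_disagreements(mask, max_points, rng):
    """Pick at most max_points indices among the disagreeing points."""
    idx = np.flatnonzero(mask)
    if idx.size > max_points:
        idx = rng.choice(idx, size=max_points, replace=False)
    return np.sort(idx)

def point_head(features, w1, b1, w2, b2):
    """Hypothetical lightweight head: one hidden ReLU layer, then class scores."""
    hidden = np.maximum(features @ w1 + b1, 0.0)
    return hidden @ w2 + b2

def late_fusion(scores_a, scores_b, head_params, max_points, rng):
    # Baseline labels: combine the class scores of both networks by summing.
    labels = (scores_a + scores_b).argmax(axis=1)
    # Refine only the sampled points on which the networks disagree.
    idx = sample_disagreements(disagreement_mask(scores_a, scores_b), max_points, rng)
    if idx.size:
        # Assumed point-level features: both score vectors side by side.
        feats = np.concatenate([scores_a[idx], scores_b[idx]], axis=1)
        labels[idx] = point_head(feats, *head_params).argmax(axis=1)
    return labels, idx

# Toy usage with random scores standing in for two projection-based networks.
rng = np.random.default_rng(0)
num_points, num_classes, hidden_dim = 1000, 5, 16
scores_a = rng.random((num_points, num_classes))
scores_b = rng.random((num_points, num_classes))
head_params = (
    rng.normal(size=(2 * num_classes, hidden_dim)), np.zeros(hidden_dim),
    rng.normal(size=(hidden_dim, num_classes)), np.zeros(num_classes),
)
labels, refined_idx = late_fusion(scores_a, scores_b, head_params, 200, rng)
print(labels.shape, refined_idx.size)
```

In this sketch the point head touches only the sampled disagreeing points, so its cost scales with the number of disagreements rather than with the full point cloud. That is one way to read the "minor overhead" claim, under the assumptions stated above.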
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Semantic segmentation | SemanticKITTI (test) | mIoU | 65.3 | 335 |
| Semantic segmentation | nuScenes (val) | mIoU (Segmentation) | 0.761 | 212 |
| LiDAR Semantic Segmentation | nuScenes (val) | mIoU | 77.2 | 169 |
| LiDAR Semantic Segmentation | nuScenes official (test) | mIoU | 77.3 | 132 |
| LiDAR Semantic Segmentation | SemanticKITTI (test) | mIoU | 65.3 | 125 |
| LiDAR Semantic Segmentation | SemanticKITTI (val) | mIoU | 65.2 | 87 |
| Semantic segmentation | nuScenes (test) | mIoU | 77.3 | 75 |
| Semantic segmentation | SemanticKITTI single-scan | mIoU | 65.3 | 46 |
| 3D Semantic Segmentation | nuScenes (test) | mIoU | 77.27 | 36 |
| Semantic segmentation | nuScenes 1.0 (val) | mIoU | 76.1 | 29 |