Scalable Scene Flow from Point Clouds in the Real World
About
Autonomous vehicles operate in highly dynamic environments, necessitating an accurate assessment of which aspects of a scene are moving and where they are moving to. A popular approach to 3D motion estimation, termed scene flow, is to employ 3D point cloud data from consecutive LiDAR scans, although such approaches have been limited by the small size of real-world, annotated LiDAR data. In this work, we introduce a new large-scale dataset for scene flow estimation derived from corresponding tracked 3D objects, which is ~1,000× larger than previous real-world datasets in terms of the number of annotated frames. We demonstrate how previous works were bounded by the amount of real LiDAR data available, suggesting that larger datasets are required to achieve state-of-the-art predictive performance. Furthermore, we show how previous heuristics for operating on point clouds, such as down-sampling, heavily degrade performance, motivating a new class of models that are tractable on the full point cloud. To address this issue, we introduce the FastFlow3D architecture, which provides real-time inference on the full point cloud. Additionally, we design human-interpretable metrics that better capture real-world aspects by accounting for ego-motion and providing breakdowns per object type. We hope that this dataset may provide new opportunities for developing real-world scene flow systems.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Scene Flow Estimation | Argoverse 2 (test) | 3-way EPE | 0.062 | 27 |
| LiDAR Scene Flow Estimation | Argoverse v2 (val) | EPE (m) - Dynamic Foreground | 0.1836 | 23 |
| LiDAR Scene Flow Estimation | Waymo Open Dataset 1.0 (val) | Dynamic Foreground EPE (m) | 0.195 | 21 |
| Scene Flow Estimation | Waymo Open Dataset (val) | -- | -- | 17 |
| Scene Flow Estimation | Waymo Open | Threeway EPE | 0.078 | 10 |
| Scene Flow Estimation | Waymo Open Dataset Longer Temporal Horizon (5 consecutive frames) | Dynamic Foreground EPE (m) | 0.6968 | 8 |
| Scene Flow Estimation | Argoverse Static Foreground v2 (test) | EPE (m) | 0.0198 | 7 |
| Scene Flow Estimation | Argoverse Static Background v2 (test) | EPE (m) | 0.0064 | 7 |
| LiDAR Scene Flow Estimation | Argoverse Successive time steps v2 | EPE (Dynamic Foreground) | 0.1836 | 7 |
| Scene Flow Estimation | Argoverse Dynamic Foreground v2 (test) | EPE (m) | 0.1836 | 7 |
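
The EPE values in the table are end-point errors in metres, some of them broken down by point class. The sketch below illustrates how such numbers are commonly computed. It assumes that EPE is the mean Euclidean distance between predicted and ground-truth per-point flow vectors, and that the three-way variant is the unweighted mean of the per-class EPEs over dynamic foreground, static foreground, and static background. The function names and class labels are hypothetical and are not taken from the paper's code or from any benchmark toolkit.

```python
import math

# Hypothetical class labels, chosen to mirror the breakdowns in the table.
CLASSES = ("dynamic_foreground", "static_foreground", "static_background")


def end_point_error(pred, gt):
    """Mean Euclidean distance (m) between predicted and true flow vectors."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of points")
    if not pred:
        raise ValueError("at least one point is required")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def epe_by_class(pred, gt, labels):
    """Per-class mean EPE; labels[i] names the class of point i."""
    if not (len(pred) == len(gt) == len(labels)):
        raise ValueError("pred, gt and labels must have the same length")
    errors = {}
    for p, g, label in zip(pred, gt, labels):
        errors.setdefault(label, []).append(math.dist(p, g))
    return {label: sum(e) / len(e) for label, e in errors.items()}


def three_way_epe(per_class):
    """Unweighted mean of the three per-class EPEs (assumed definition)."""
    missing = [c for c in CLASSES if c not in per_class]
    if missing:
        raise ValueError(f"missing classes: {missing}")
    return sum(per_class[c] for c in CLASSES) / len(CLASSES)


if __name__ == "__main__":
    # Toy flow vectors (dx, dy, dz) in metres, one point per class.
    pred = [(0.3, 0.4, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
    gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    labels = list(CLASSES)
    per_class = epe_by_class(pred, gt, labels)
    print(per_class)
    print(three_way_epe(per_class))
```

Averaging per class rather than over all points keeps the large number of static background points from dominating the score, which is one reason per-class breakdowns are reported separately.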