Khronos: A Unified Approach for Spatio-Temporal Metric-Semantic SLAM in Dynamic Environments
About
Perceiving and understanding highly dynamic and changing environments is a crucial capability for robot autonomy. While large strides have been made towards developing dynamic SLAM approaches that estimate the robot pose accurately, less emphasis has been placed on the construction of dense spatio-temporal representations of the robot environment. A detailed understanding of the scene and its evolution through time is crucial for long-term robot autonomy and essential to tasks that require long-term reasoning, such as operating effectively in environments that are shared with humans and other agents and are thus subject to short- and long-term dynamics. To address this challenge, this work defines the Spatio-temporal Metric-semantic SLAM (SMS) problem and presents a framework to factorize and solve it efficiently. We show that the proposed factorization suggests a natural organization of a spatio-temporal perception system, where a fast process tracks short-term dynamics in an active temporal window, while a slower process reasons over long-term changes in the environment using a factor graph formulation. We provide an efficient implementation of the proposed spatio-temporal perception approach, which we call Khronos, and show that it unifies existing interpretations of short-term and long-term dynamics and is able to construct a dense spatio-temporal map in real time. We provide simulated and real results, showing that the spatio-temporal maps built by Khronos are an accurate reflection of a 3D scene over time and that Khronos outperforms baselines across multiple metrics. We further validate our approach on two heterogeneous robots in challenging, large-scale real-world environments.
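The two-rate organization described in the abstract can be illustrated with a toy sketch. This is not the Khronos implementation: the class names (`ActiveWindow`, `SlowBackend`, `Fragment`), the `horizon` parameter, and the rule for closing a track are all assumptions made for illustration, and the factor graph optimization of the slow process is replaced by a placeholder that only stores what it receives.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    stamp: float      # time of the observation, in seconds
    object_id: int    # hypothetical track identifier
    position: tuple   # (x, y, z) in an assumed odometry frame


@dataclass
class Fragment:
    """Summary of one track, handed from the fast to the slow process."""
    object_id: int
    first_seen: float
    last_seen: float
    positions: list


class ActiveWindow:
    """Fast process (illustrative): tracks objects seen within `horizon` seconds."""

    def __init__(self, horizon: float):
        self.horizon = horizon
        self.tracks = {}  # object_id -> list of Observation

    def update(self, obs: Observation) -> list:
        """Add an observation and return fragments for tracks that left the window."""
        self.tracks.setdefault(obs.object_id, []).append(obs)
        closed = []
        for object_id in list(self.tracks):
            track = self.tracks[object_id]
            # A track is closed once its latest observation is older than the horizon.
            if obs.stamp - track[-1].stamp > self.horizon:
                closed.append(
                    Fragment(
                        object_id=object_id,
                        first_seen=track[0].stamp,
                        last_seen=track[-1].stamp,
                        positions=[o.position for o in track],
                    )
                )
                del self.tracks[object_id]
        return closed


class SlowBackend:
    """Slow process (placeholder): only accumulates fragments.

    A real system would reason over long-term changes here, for example
    with a factor graph; that step is deliberately omitted.
    """

    def __init__(self):
        self.fragments = []

    def add(self, fragments: list) -> None:
        self.fragments.extend(fragments)


window = ActiveWindow(horizon=2.0)
backend = SlowBackend()
for obs in [
    Observation(0.0, 1, (0.0, 0.0, 0.0)),
    Observation(1.0, 2, (1.0, 0.0, 0.0)),
    Observation(3.5, 2, (1.2, 0.0, 0.0)),
]:
    backend.add(window.update(obs))
```

In this sketch the fast process runs on every observation and keeps only recent tracks in memory, while the slow process receives compact fragments and can run at a lower rate without blocking tracking.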
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Open-set 3D object extraction | Clio Apartment | osR | 45 | 11 |
| Open-set 3D object extraction | Clio Cubicle | osR | 78 | 11 |
| Open-set 3D object extraction | Clio Office | osR | 67 | 9 |
| Scene-change task success | Scene-change two-session protocol (post-change navigation) | Relocation Success Rate | 18 | 5 |
| Articulated 3D Scene Graph Reconstruction | Bedroom simulated | Mesh Recovery Precision | 98.2 | 4 |
| Articulated 3D Scene Graph Reconstruction | Kitchen simulated | Mesh Recovery Precision | 92.7 | 4 |