OMCL: Open-vocabulary Monte Carlo Localization

About

Robust robot localization is an important prerequisite for navigation, but it becomes challenging when the map and robot measurements are obtained from different sensors. Prior methods are often tailored to specific environments, relying on closed-set semantics or fine-tuned features. In this work, we extend Monte Carlo Localization with vision-language features, allowing OMCL to robustly compute the likelihood of visual observations given a camera pose and a 3D map created from posed RGB-D images or aligned point clouds. These open-vocabulary features enable us to associate observations and map elements from different modalities, and to natively initialize global localization through natural language descriptions of nearby objects. We evaluate our approach using Matterport3D and Replica for indoor scenes and demonstrate generalization on SemanticKITTI for outdoor scenes.
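The core idea of weighting pose hypotheses by how well an observed feature matches the map can be illustrated with a minimal particle-weighting sketch. This is not the authors' implementation: the function names, the cosine-similarity likelihood, the `temperature` parameter, and the systematic resampling step are all assumptions chosen for illustration, and the feature vectors stand in for vision-language embeddings that would come from a real model and a real 3D map.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between arrays broadcastable to (N, D)."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return np.sum(a * b, axis=-1)


def update_weights(weights, expected_feats, observed_feat, temperature=0.1):
    """Reweight particles by feature agreement (hypothetical likelihood model).

    weights:        (N,) current particle weights, summing to 1
    expected_feats: (N, D) map feature each particle expects to observe
    observed_feat:  (D,) feature extracted from the current camera image
    temperature:    assumed sharpness parameter, not taken from the paper
    """
    sim = cosine_similarity(expected_feats, observed_feat[None, :])
    likelihood = np.exp(sim / temperature)
    new_weights = weights * likelihood
    return new_weights / new_weights.sum()


def systematic_resample(weights, rng):
    """Return indices of particles drawn by systematic resampling."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three particles; only the second expects a feature like the observed one.
    expected = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    observed = np.array([0.0, 2.0])
    w = update_weights(np.full(3, 1.0 / 3.0), expected, observed)
    print("weights:", w)
    print("resampled indices:", systematic_resample(w, rng))
```

Under the same assumptions, a text-based initialization could reuse `update_weights` by passing the embedding of a natural language description as `observed_feat`, since open-vocabulary features place text and image observations in a shared space.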

Evgenii Kruzhkov, Raphael Memmesheimer, Sven Behnke • 2025

Related benchmarks

Task | Dataset | Result | Rank
3D Semantic Segmentation | Replica (test) | mIoU (All): 32.1 | 10
2D Localization | Matterport3D | RMSE (m): 0.11 | 6
Visual Localization | KITTI Sequence 00 | Translation Mean (m): 0.52 | 6
3D Semantic Segmentation | Replica 3D | mIoU: 32.1 | 5
3D Localization | Matterport3D | RMSE (m): 0.15 | 4
Trajectory Estimation | Replica 3D | ATE RMSE: 35 | 3
Localization | Replica (test) | ATE RMSE: 35 | 1