OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments

About

Environment representations endowed with rich semantics are pivotal for seamless interaction between robots and humans, enabling robots to carry out a variety of tasks effectively. Open-vocabulary maps, powered by vision-language models (VLMs), have inherent advantages, including zero-shot learning and support for open-set classes. However, existing open-vocabulary maps are designed primarily for small-scale environments, such as desktops or rooms, and are typically geared towards limited-area tasks involving robotic indoor navigation or in-place manipulation. They do not generalize directly to outdoor environments, which are characterized by numerous objects and complex tasks, owing to limitations in both understanding level and map structure. In this work, we propose OpenGraph, the first open-vocabulary hierarchical graph representation designed for large-scale outdoor environments. OpenGraph first extracts instances and their captions from visual images, and encodes them to enhance textual reasoning. Subsequently, it achieves 3D incremental object-centric mapping with feature embedding by projecting images onto LiDAR point clouds. Finally, the environment is segmented based on lane graph connectivity to construct a hierarchical graph. Validation results on the public SemanticKITTI dataset demonstrate that OpenGraph achieves the highest segmentation and query accuracy. The source code of OpenGraph is publicly available at https://github.com/BIT-DYN/OpenGraph.
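
The abstract mentions projecting images onto LiDAR point clouds to build the object-centric map. As a rough illustration of that kind of step, the sketch below projects LiDAR points into an image with a generic pinhole camera model and assigns each point the instance id of the pixel it lands on. It is not taken from the OpenGraph source code: the names `Camera`, `project_point`, and `label_points`, the mask layout (`mask[row][column]`), and the calibration values are all assumptions made for this example.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


class Camera(NamedTuple):
    """Hypothetical pinhole intrinsics plus image size (illustrative only)."""
    fx: float
    fy: float
    cx: float
    cy: float
    width: int
    height: int


def transform(point: Point3D,
              rotation: Sequence[Sequence[float]],
              translation: Sequence[float]) -> Point3D:
    """Apply a rigid LiDAR-to-camera transform: p_cam = R * p_lidar + t."""
    x, y, z = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    return (x, y, z)


def project_point(point_lidar: Point3D,
                  rotation: Sequence[Sequence[float]],
                  translation: Sequence[float],
                  cam: Camera) -> Optional[Pixel]:
    """Return the (u, v) pixel for a LiDAR point, or None if it is not visible."""
    x, y, z = transform(point_lidar, rotation, translation)
    if z <= 0.0:  # point is behind the camera
        return None
    u = cam.fx * x / z + cam.cx
    v = cam.fy * y / z + cam.cy
    if 0 <= u < cam.width and 0 <= v < cam.height:
        return (u, v)
    return None  # projects outside the image bounds


def label_points(points: Sequence[Point3D],
                 instance_mask: Sequence[Sequence[int]],
                 rotation: Sequence[Sequence[float]],
                 translation: Sequence[float],
                 cam: Camera) -> List[Optional[int]]:
    """Give each point the instance id of its pixel (None if not visible)."""
    labels: List[Optional[int]] = []
    for point in points:
        pixel = project_point(point, rotation, translation, cam)
        if pixel is None:
            labels.append(None)
        else:
            u, v = pixel
            labels.append(instance_mask[int(v)][int(u)])
    return labels
```

In a real pipeline the rotation, translation, and intrinsics would come from the dataset's calibration files, and the per-point labels would typically be fused across frames rather than taken from a single image.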

Yinan Deng, Jiahui Wang, Jingyu Zhao, Xinyu Tian, Guangyan Chen, Yi Yang, Yufeng Yue • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Spatial | NaVQA Medium memory horizon | SR 63 | 3 |
| Temporal | NaVQA Medium memory horizon | SR 67 | 3 |
| Multi-modal Reasoning | WH-VQA | SR 19 | 3 |
| Spatial | NaVQA Short memory horizon | SR 68 | 3 |
| Spatial | NaVQA Long memory horizon | SR 26 | 3 |
| Spatial Reasoning | WH-VQA | SR 27 | 3 |
| Temporal | NaVQA Short memory horizon | SR 89 | 3 |
| Temporal | NaVQA Long memory horizon | SR 59 | 3 |
| Textual | NaVQA Short memory horizon | SR 33 | 3 |
| Textual | NaVQA Medium memory horizon | SR 43 | 3 |
Showing 10 of 12 rows
