
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

About

Autonomous robot navigation and manipulation in open environments require reasoning and replanning with closed-loop feedback. In this work, we present COME-robot, the first closed-loop robotic system utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. COME-robot incorporates two key innovative modules: (i) a multi-level open-vocabulary perception and situated reasoning module that enables effective exploration of the 3D environment and target object identification using commonsense knowledge and situated information, and (ii) an iterative closed-loop feedback and restoration mechanism that verifies task feasibility, monitors execution success, and traces failure causes across different modules for robust failure recovery. Through comprehensive experiments involving 8 challenging real-world mobile and tabletop manipulation tasks, COME-robot demonstrates a significant improvement in task success rate (~35%) compared to state-of-the-art methods. We further conduct comprehensive analyses to elucidate how COME-robot's design facilitates failure recovery, free-form instruction following, and long-horizon task planning.

Peiyuan Zhi, Zhiyuan Zhang, Yu Zhao, Muzhi Han, Zeyu Zhang, Zhitian Li, Ziyuan Jiao, Baoxiong Jia, Siyuan Huang • 2024
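
The closed-loop feedback idea in the abstract (verify each step, record the failure cause, replan) can be illustrated with a minimal sketch. This is not COME-robot's implementation: `ClosedLoopRunner`, `StepResult`, and the `plan`, `execute`, and `verify` callables are hypothetical names standing in for a planner, a robot controller, and a feedback checker, and the retry limit `max_replans` is an assumed parameter.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StepResult:
    """Outcome of verifying one executed step (hypothetical structure)."""
    ok: bool
    failure_cause: Optional[str] = None  # e.g. "perception", "grasp"


@dataclass
class ClosedLoopRunner:
    # All three callables are illustrative stand-ins, not a real API.
    plan: Callable[[str, List[str]], List[str]]  # (instruction, failure history) -> steps
    execute: Callable[[str], None]               # carry out one step
    verify: Callable[[str], StepResult]          # check the step's outcome from feedback
    max_replans: int = 3                         # assumed retry budget
    history: List[str] = field(default_factory=list)

    def run(self, instruction: str) -> bool:
        """Plan, execute, and verify; replan with the failure history on error."""
        for _ in range(self.max_replans + 1):
            steps = self.plan(instruction, self.history)
            failed = False
            for step in steps:
                self.execute(step)
                result = self.verify(step)
                if not result.ok:
                    # Record what failed and why so the next plan can adapt.
                    self.history.append(f"{step}: {result.failure_cause}")
                    failed = True
                    break
            if not failed:
                return True
        return False
```

In a real system the `plan` callable would be where a vision-language model consumes the accumulated failure history; here it is left as a plain function so the control flow stays visible.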

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Object retrieval and navigation | MansionWorld Four floors | Success Rate | 0 | 3 |
| Object retrieval and navigation | MansionWorld Double floors | Overall Success Rate | 20 | 3 |
| Object retrieval and navigation | MansionWorld Single floor | Overall Success Rate | 30 | 3 |
| Move crumpled paper (brush nearby) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 25 | 2 |
| Move egg (open view) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 0.2 | 2 |
| Move grape/cherry (open view) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 20 | 2 |
| Move screw (towel nearby) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 0 | 2 |
| Move sushi (open view) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 14 | 2 |
| Move tiny candy (towel nearby) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 0.11 | 2 |
| Pick up bowl (apple inside) | Custom Robot Manipulation Scenes 1.0 (test) | Success Rate | 17 | 2 |
Showing 10 of 15 rows
