VeriGraph: Scene Graphs for Execution Verifiable Robot Planning

About

Recent progress in vision-language models (VLMs) has opened new possibilities for robot task planning, but these models often produce incorrect action sequences. To address these limitations, we propose VeriGraph, a novel framework that integrates VLMs for robotic planning while verifying action feasibility. VeriGraph uses scene graphs as an intermediate representation to capture key objects and spatial relationships, enabling more reliable plan verification and refinement. The system generates a scene graph from input images and uses it to iteratively check and correct action sequences generated by an LLM-based task planner, ensuring constraints are respected and actions are executable. Our approach significantly enhances task completion rates across diverse manipulation scenarios, outperforming baseline methods by 58% on language-based tasks, 56% on tangram puzzle tasks, and 30% on image-based tasks. Qualitative results and code can be found at https://verigraph-agent.github.io.
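The verify-and-refine loop described above can be illustrated with a small sketch. This is not the authors' implementation: the `SceneGraph`, `Move`, `check_move`, `verify_plan`, and `plan_with_verification` names, the single "on" relation, the feasibility rules, and the `toy_planner` stand-in for the LLM-based planner are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

TABLE = "table"  # hypothetical name for the supporting surface


@dataclass
class SceneGraph:
    # Assumed representation: on[x] = y means object x rests directly on y.
    on: Dict[str, str] = field(default_factory=dict)

    def is_clear(self, obj: str) -> bool:
        """True if no object rests directly on top of obj."""
        return all(support != obj for support in self.on.values())

    def copy(self) -> "SceneGraph":
        return SceneGraph(dict(self.on))


@dataclass(frozen=True)
class Move:
    obj: str   # object to pick up
    dest: str  # object (or TABLE) to place it on


def check_move(graph: SceneGraph, move: Move) -> Optional[str]:
    """Return a reason the move is infeasible, or None if it is executable."""
    if move.obj not in graph.on:
        return f"{move.obj} is not in the scene"
    if move.dest != TABLE and move.dest not in graph.on:
        return f"{move.dest} is not in the scene"
    if move.obj == move.dest:
        return f"{move.obj} cannot be placed on itself"
    if not graph.is_clear(move.obj):
        return f"{move.obj} has something on top of it"
    if move.dest != TABLE and not graph.is_clear(move.dest):
        return f"{move.dest} is already occupied"
    return None


def verify_plan(graph: SceneGraph, plan: List[Move]) -> Optional[Tuple[int, str]]:
    """Simulate the plan on a copy; return (step index, reason) of the first failure."""
    state = graph.copy()
    for i, move in enumerate(plan):
        reason = check_move(state, move)
        if reason is not None:
            return i, reason
        state.on[move.obj] = move.dest  # apply the move to the simulated graph
    return None


def plan_with_verification(
    graph: SceneGraph,
    planner: Callable[[Optional[str]], List[Move]],
    max_rounds: int = 3,
) -> Optional[List[Move]]:
    """Ask the planner for a plan, feeding back the first failure until it verifies."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        plan = planner(feedback)
        failure = verify_plan(graph, plan)
        if failure is None:
            return plan
        step, reason = failure
        feedback = f"step {step}: {reason}"
    return None  # no verified plan within the round budget


def toy_planner(feedback: Optional[str]) -> List[Move]:
    """Stand-in for an LLM planner: a naive first attempt, then a corrected plan."""
    if feedback is None:
        return [Move("A", "C")]
    return [Move("B", TABLE), Move("A", "C")]


# Scene: B sits on A; A and C sit on the table. Goal: put A on C.
scene = SceneGraph({"A": TABLE, "B": "A", "C": TABLE})
verified = plan_with_verification(scene, toy_planner)
```

In this toy scene the first proposal fails verification because B rests on A, and the second proposal clears A before moving it. Simulating on a copy of the graph keeps the observed scene unchanged while candidate plans are checked.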

Daniel Ekpo, Mara Levy, Saksham Suri, Chuong Huynh, Archana Swaminathan, Abhinav Shrivastava • 2024

Related benchmarks

Task                 Dataset                         Result           Rank
Robot Task Planning  Stacking                        Success Rate 65  4
Robot Task Planning  Language Instruction            Success Rate 73  4
Robot Task Planning  Ref. Image Instruction Kitchen  Success Rate 55  4
Robot Task Planning  Ref. Image Instruction Blocks   Success Rate 86  3
Robot Task Planning  Tangram puzzle                  Success Rate 72  3