Enabling Visual Action Planning for Object Manipulation through Latent Space Roadmap

About

We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces, focusing on manipulation of deformable objects. We propose a Latent Space Roadmap (LSR) for task planning, a graph-based structure that globally captures the system dynamics in a low-dimensional latent space. Our framework consists of three parts: (1) a Mapping Module (MM) that maps image observations into a structured latent space, extracting the corresponding states, and generates observations from latent states, (2) the LSR, which builds and connects clusters of similar states in order to find latent plans between the start and goal states extracted by the MM, and (3) the Action Proposal Module, which complements the latent plan found by the LSR with the corresponding actions. We present a thorough investigation of our framework on simulated box stacking and rope/box manipulation tasks, and on a folding task executed on a real robot.
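The roadmap idea in part (2) can be illustrated with a small sketch. This is not the authors' implementation: the greedy distance-threshold clustering, the directed edges, the breadth-first search, and all names (cluster_states, build_roadmap, plan, eps) are assumptions made for illustration, and the 2-D tuples stand in for latent states that a mapping module would produce from images.

```python
from collections import deque
import math


def cluster_states(states, eps):
    """Greedy clustering (an illustrative stand-in, not the paper's method):
    assign each latent state to the first cluster whose centroid lies
    within eps, otherwise open a new cluster."""
    centroids, members, labels = [], [], []
    for z in states:
        for k, c in enumerate(centroids):
            if math.dist(z, c) <= eps:
                members[k].append(z)
                # Recompute the centroid as the mean of the cluster members.
                centroids[k] = tuple(sum(v) / len(members[k]) for v in zip(*members[k]))
                labels.append(k)
                break
        else:
            centroids.append(tuple(z))
            members.append([z])
            labels.append(len(centroids) - 1)
    return labels, centroids


def build_roadmap(labels, transitions):
    """Connect clusters with a directed edge whenever an observed
    transition (i, j) links a state in one cluster to a state in another."""
    edges = {k: set() for k in set(labels)}
    for i, j in transitions:
        a, b = labels[i], labels[j]
        if a != b:
            edges[a].add(b)
    return edges


def plan(edges, start, goal):
    """Breadth-first search for a shortest cluster sequence; None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(edges[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical latent states and observed transitions (pairs of state indices).
states = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.1), (2.0, 0.0)]
transitions = [(0, 2), (3, 4), (1, 3)]

labels, centroids = cluster_states(states, eps=0.3)
edges = build_roadmap(labels, transitions)
latent_plan = plan(edges, start=labels[0], goal=labels[4])
print(latent_plan)  # → [0, 1, 2]
```

In the framework described above, each step of such a latent plan would then be decoded back into an image by the MM and paired with an action by the Action Proposal Module; those learned components are outside the scope of this sketch.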

Martina Lippi, Petra Poklukar, Michael C. Welle, Anastasia Varava, Hang Yin, Alessandro Marino, Danica Kragic • 2021

Related benchmarks

Task                                     Dataset          Result                 Rank
Visual Task Planning and Graph Learning  Fruit-2x3        Optimality Score 100   3
Visual Task Planning and Graph Learning  Multi Fruit 2x3  Opt 40                 3
Visual Task Planning and Graph Learning  Blocks 2         Opt 76.7               3
Visual Task Planning and Graph Learning  Blocks 3         Opt 33.6               3
Visual Task Planning and Graph Learning  Fruit-4x6        Opt Rate 37.5          3
Visual Task Planning and Graph Learning  Fruit 6x8        Opt 52.7               3
Visual Task Planning and Graph Learning  Multi Fruit 4x6  Opt 12                 3