STAP: Sequencing Task-Agnostic Policies

About

Advances in robotic skill acquisition have made it possible to build general-purpose libraries of learned skills for downstream manipulation tasks. However, naively executing these skills one after the other is unlikely to succeed without accounting for dependencies between actions prevalent in long-horizon plans. We present Sequencing Task-Agnostic Policies (STAP), a scalable framework for training manipulation skills and coordinating their geometric dependencies at planning time to solve long-horizon tasks never seen by any skill during training. Given that Q-functions encode a measure of skill feasibility, we formulate an optimization problem to maximize the joint success of all skills sequenced in a plan, which we estimate by the product of their Q-values. Our experiments indicate that this objective function approximates ground truth plan feasibility and, when used as a planning objective, reduces myopic behavior and thereby promotes long-horizon task success. We further demonstrate how STAP can be used for task and motion planning by estimating the geometric feasibility of skill sequences provided by a task planner. We evaluate our approach in simulation and on a real robot. Qualitative results and code are made available at https://sites.google.com/stanford.edu/stap.

Christopher Agia, Toki Migimatsu, Jiajun Wu, Jeannette Bohg • 2022
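The planning objective described in the abstract, the product of per-skill Q-values along a rolled-out plan, can be illustrated with a small toy sketch. Everything below is a hypothetical stand-in rather than the STAP codebase: the scalar state and action, the two hand-written Q-functions, the dynamics functions, and the uniform random-shooting optimizer are all assumptions chosen to keep the example short. The example contrasts a myopic choice (maximize the first skill's Q-value alone) with a joint choice (maximize the product).

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy types: scalar state, one scalar action parameter per skill.
QFunction = Callable[[float, float], float]  # (state, action) -> success estimate in [0, 1]
Dynamics = Callable[[float, float], float]   # (state, action) -> predicted next state


def plan_feasibility(state: float, actions: Sequence[float],
                     q_functions: Sequence[QFunction],
                     dynamics: Sequence[Dynamics]) -> float:
    """Estimate joint plan success as the product of per-skill Q-values."""
    score = 1.0
    for action, q_fn, step in zip(actions, q_functions, dynamics):
        score *= q_fn(state, action)
        state = step(state, action)  # roll the predicted state forward
    return score


def random_shooting(state: float, q_functions: Sequence[QFunction],
                    dynamics: Sequence[Dynamics], num_samples: int = 2000,
                    seed: int = 0) -> Tuple[List[float], float]:
    """Sample action sequences uniformly in [-1, 1] and keep the best product.

    Assumption: a deliberately simple optimizer, used here only for illustration.
    """
    rng = random.Random(seed)
    best_actions: List[float] = []
    best_score = -1.0
    for _ in range(num_samples):
        candidate = [rng.uniform(-1.0, 1.0) for _ in q_functions]
        score = plan_feasibility(state, candidate, q_functions, dynamics)
        if score > best_score:
            best_actions, best_score = candidate, score
    return best_actions, best_score


# Toy skill 1: most likely to succeed at action 0; its action sets the next state.
def q_first(state: float, action: float) -> float:
    return math.exp(-action ** 2)


def step_first(state: float, action: float) -> float:
    return action


# Toy skill 2: only likely to succeed if skill 1 left the state near 0.8.
def q_second(state: float, action: float) -> float:
    return math.exp(-4.0 * (state - 0.8) ** 2 - action ** 2)


def step_second(state: float, action: float) -> float:
    return state


q_functions = [q_first, q_second]
dynamics = [step_first, step_second]

# Myopic plan: each skill picks its individually best action (0 for both here).
greedy_score = plan_feasibility(0.0, [0.0, 0.0], q_functions, dynamics)

# Joint plan: optimize all action parameters against the product of Q-values.
best_actions, best_score = random_shooting(0.0, q_functions, dynamics)

print(f"myopic plan score: {greedy_score:.3f}")
print(f"joint plan score:  {best_score:.3f} with actions {best_actions}")
```

In this toy problem the myopic plan scores about 0.08, because the first skill's preferred action leaves the second skill in a poor state. The joint optimum is about 0.60, reached when the first skill accepts a slightly lower Q-value so that the second skill becomes feasible.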

Related benchmarks

Task	Dataset	Result
Task and Motion Planning	TAMP Rearrangement Push Task 2, Length 7	Success Rate: 0.7	8
Task and Motion Planning	TAMP Hook Reach Task 1, Length 4	Success Rate: 66	8
Task and Motion Planning	TAMP Hook Reach Task 2, Length 5	Success Rate: 70	8
Task and Motion Planning	TAMP Rearrangement Push Task 1, Length 4	Success Rate: 76	8
Task and Motion Planning	TAMP Rearrangement Memory Task 2, Length 7	Success Rate: 0	8
Task and Motion Planning	TAMP Rearrangement Memory Task 1, Length 4	Success Rate: 0	8
