DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability

About

Generative models such as diffusion models, excel at capturing high-dimensional distributions with diverse input modalities, e.g. robot trajectories, but are less effective at multi-step constraint reasoning. Task and Motion Planning (TAMP) approaches are suited for planning multi-step autonomous robot manipulation. However, it can be difficult to apply them to domains where the environment and its dynamics are not fully known. We propose to overcome these limitations by composing diffusion models using a TAMP system. We use the learned components for constraints and samplers that are difficult to engineer in the planning model, and use a TAMP solver to search for the task plan with constraint-satisfying action parameter values. To tractably make predictions for unseen objects in the environment, we define the learned samplers and TAMP operators on learned latent embedding of changing object states. We evaluate our approach in a simulated articulated object manipulation domain and show how the combination of classical TAMP, generative modeling, and latent embedding enables multi-step constraint-based reasoning. We also apply the learned sampler in the real world. Website: https://sites.google.com/view/dimsam-tamp

Xiaolin Fang, Caelan Reed Garrett, Clemens Eppner, Tom\'as Lozano-P\'erez, Leslie Pack Kaelbling, Dieter Fox• 2023

Related benchmarks

Task	Dataset	Result	Rank
Image-to-Image Translation	summer-winter Global 512x512	FID101.3		12
Image-to-Image Translation	horse-zebra Local 512x512	FID108.7		11

Showing 2 of 2 rows

Other info

Follow for update

@wizwand_team Discord