Compositional Diffusion with Guided Search for Long-Horizon Planning
About
Generative models have emerged as powerful tools for planning, with compositional approaches offering particular promise for modeling long-horizon task distributions by composing together local, modular generative models. This compositional paradigm spans diverse domains, from multi-step manipulation planning to panoramic image synthesis to long video generation. However, compositional generative models face a critical challenge: when local distributions are multimodal, existing composition methods average incompatible modes, producing plans that are neither locally feasible nor globally coherent. We propose Compositional Diffusion with Guided Search (CDGS), which addresses this mode averaging problem by embedding search directly within the diffusion denoising process. Our method explores diverse combinations of local modes through population-based sampling, prunes infeasible candidates using likelihood-based filtering, and enforces global consistency through iterative resampling between overlapping segments. CDGS matches oracle performance on seven robot manipulation tasks, outperforming baselines that lack compositionality or require long-horizon training data. The approach generalizes across domains, enabling coherent text-guided panoramic images and long videos through effective local-to-global message passing. More details: https://cdgsearch.github.io/
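The three ingredients named in the abstract (population-based sampling, likelihood-based pruning, and resampling between overlapping segments) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 1-D chain of segments whose local distribution is a two-mode Gaussian mixture, a hand-written "denoiser" that pulls each value toward its nearest mode, and a made-up score that combines local log-likelihood with a neighbor-mismatch penalty. All names (`guided_search`, `plan_score`, `resample_overlaps`, and so on) and all constants are hypothetical and chosen only for illustration.

```python
import math
import random

MODES = (-1.0, 1.0)  # assumption: each local segment model has two modes
SIGMA = 0.1          # assumption: width of each mode


def local_log_lik(x):
    """Unnormalized log-density of one segment under the bimodal local model."""
    density = sum(math.exp(-0.5 * ((x - m) / SIGMA) ** 2) for m in MODES)
    return math.log(density + 1e-300)  # guard against log(0) on underflow


def plan_score(plan):
    """Local likelihood of every segment minus a penalty for neighbor mismatch."""
    local = sum(local_log_lik(x) for x in plan)
    mismatch = sum((a - b) ** 2 for a, b in zip(plan, plan[1:]))
    return local - mismatch / (2 * SIGMA ** 2)


def denoise_step(plan, noise, rng):
    """Toy denoiser: move each segment halfway to its nearest mode, then add noise."""
    out = []
    for x in plan:
        target = min(MODES, key=lambda m: abs(x - m))
        out.append(x + 0.5 * (target - x) + noise * rng.gauss(0, 1))
    return out


def resample_overlaps(plan, strength=0.25):
    """Toy message passing: pull each segment toward the mean of its neighbors."""
    out = list(plan)
    for i in range(len(plan)):
        nbrs = [plan[j] for j in (i - 1, i + 1) if 0 <= j < len(plan)]
        out[i] += strength * (sum(nbrs) / len(nbrs) - plan[i])
    return out


def guided_search(num_segments=4, population=32, keep=8, steps=30, seed=0):
    """Denoise a population of plans, pruning low-scoring candidates each step."""
    rng = random.Random(seed)
    plans = [[rng.gauss(0, 1) for _ in range(num_segments)]
             for _ in range(population)]
    for t in range(steps):
        noise = 0.5 * (1 - (t + 1) / steps)  # noise decays linearly to zero
        plans = [resample_overlaps(denoise_step(p, noise, rng)) for p in plans]
        plans.sort(key=plan_score, reverse=True)
        survivors = plans[:keep]             # prune low-scoring candidates
        plans = [list(survivors[i % keep]) for i in range(population)]
    return max(plans, key=plan_score)


def averaged_plan(num_segments=4):
    """What naive mode averaging yields: every segment sits between the modes."""
    return [sum(MODES) / len(MODES)] * num_segments


if __name__ == "__main__":
    best = guided_search()
    print("searched plan:", [round(x, 2) for x in best], plan_score(best))
    print("averaged plan:", averaged_plan(), plan_score(averaged_plan()))
```

In this toy setting the averaged plan places every segment at 0, which has low likelihood under each local model. The searched plan is intended to commit all segments to a single shared mode, which is the behavior the abstract attributes to embedding search in the denoising process.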
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robotic Planning | OGBench PointMaze Giant 48 (stitch) | Success Rate | 82 | 8 |
| Robotic Planning | OGBench AntMaze Giant 48 (stitch) | Success Rate | 84 | 8 |
| Robotic Planning | OGBench Scene 48 (play) | Success Rate | 0.51 | 8 |
| Task and Motion Planning | TAMP Rearrangement Push Task 1 Length 4 | Success Rate | 84 | 8 |
| Task and Motion Planning | TAMP Rearrangement Memory Task 1 Length 4 | Success Rate | 42 | 8 |
| Task and Motion Planning | TAMP Rearrangement Memory Task 2 Length 7 | Success Rate | 18 | 8 |
| Task and Motion Planning | TAMP Hook Reach Task 1, Length 4 | Success Rate | 64 | 8 |
| Task and Motion Planning | TAMP Hook Reach Task 2 Length 5 | Success Rate | 58 | 8 |
| Task and Motion Planning | TAMP Rearrangement Push Task 2, Length 7 | Success Rate | 0.48 | 8 |
| Long Video Generation | VBench | Subject Consistency | 91.67 | 5 |