CAPE: Context-Aware Diffusion Policy Via Proximal Mode Expansion for Collision Avoidance

About

In robotics, diffusion models can capture multi-modal trajectories from demonstrations, making them a transformative approach to imitation learning. However, achieving optimal performance under this regime requires a large-scale dataset, which is costly to obtain, especially for challenging tasks such as collision avoidance. In such tasks, generalization at test time demands coverage of many obstacle types and their spatial configurations, which is impractical to achieve purely through data collection. To remedy this problem, we propose Context-Aware diffusion policy via Proximal mode Expansion (CAPE), a framework that expands trajectory distribution modes with a context-aware prior and guidance at inference via a novel prior-seeded iterative guided refinement procedure. The framework generates an initial trajectory plan and executes a short prefix of it; the remaining trajectory segment is then perturbed to an intermediate noise level, forming a trajectory prior. This prior is context-aware and preserves task intent. Repeating the process with context-aware guided denoising iteratively expands mode support, allowing smoother, less collision-prone trajectories to be found. For collision avoidance, CAPE expands trajectory distribution modes with collision-aware context, enabling the sampling of collision-free trajectories in previously unseen environments while maintaining goal consistency. We evaluate CAPE on diverse manipulation tasks in cluttered, unseen simulated and real-world settings and show up to 26% and 80% higher success rates, respectively, compared to state-of-the-art (SOTA) methods, demonstrating better generalization to unseen environments.
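The plan, execute-prefix, re-noise, and guided-denoise loop described above can be illustrated with a toy 2D sketch. This is not the authors' implementation: `toy_denoise` is a hypothetical stand-in for a learned diffusion policy (it simply pulls waypoints toward a straight line), `collision_guidance` is an assumed repulsion term for circular obstacles, and all parameter names and default values are illustrative. No real noise schedule is modeled.

```python
import numpy as np

def toy_denoise(traj, start, goal, strength=0.5):
    # Hypothetical stand-in for a learned denoiser: pull waypoints
    # toward the straight line from start to goal.
    line = np.linspace(start, goal, len(traj))
    return traj + strength * (line - traj)

def collision_guidance(traj, obstacles, radius, scale=0.3):
    # Assumed context-aware guidance: push waypoints that lie inside a
    # circular obstacle outward along the direction away from its center.
    push = np.zeros_like(traj)
    for center in obstacles:
        diff = traj - center
        dist = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-8
        push += np.where(dist < radius, scale * (radius - dist) * diff / dist, 0.0)
    return push

def guided_denoise(traj, start, goal, obstacles, radius, steps):
    # Alternate a denoising step with a guidance step; pin the endpoints
    # so the segment stays consistent with the current state and the goal.
    for _ in range(steps):
        traj = toy_denoise(traj, start, goal)
        traj = traj + collision_guidance(traj, obstacles, radius)
        traj[0], traj[-1] = start, goal
    return traj

def plan_and_refine(start, goal, obstacles, radius=0.3, horizon=16, prefix=4,
                    noise_level=0.2, full_steps=20, refine_steps=8, seed=0):
    rng = np.random.default_rng(seed)
    start = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    obstacles = np.asarray(obstacles, dtype=float).reshape(-1, 2)

    # Initial plan: guided denoising starting from pure noise.
    plan = guided_denoise(rng.normal(size=(horizon, 2)), start, goal,
                          obstacles, radius, full_steps)
    executed = [start.copy()]
    while len(plan) > prefix:
        executed.extend(plan[1:prefix + 1])   # execute a short prefix
        rest = plan[prefix:]                  # remaining segment
        current = rest[0].copy()
        # Perturb the remainder to an intermediate noise level: the prior.
        prior = rest + noise_level * rng.normal(size=rest.shape)
        # Refine the prior with guided denoising (fewer steps than a full plan).
        plan = guided_denoise(prior, current, goal, obstacles, radius, refine_steps)
    executed.extend(plan[1:])
    return np.array(executed)

path = plan_and_refine(start=(0.0, 0.0), goal=(2.0, 0.0), obstacles=[(1.0, 0.05)])
print(path.shape)
```

Because each refinement starts from a perturbed copy of the previous plan rather than from pure noise, the refined segment stays close to the earlier plan while the guidance term can still move it away from obstacles. That is the intuition this sketch is meant to convey; the toy makes no claim about the success rates reported for CAPE.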

Rui Heng Yang, Xuan Zhao, Leo Maxime Brunswic, Montgomery Alban, Mateo Clemente, Tongtong Cao, Jun Jin, Amir Rasouli • 2025

Related benchmarks

Task	Dataset	Result
Collision Avoidance	ENV1: CONCEPT Full observation	Success Rate 0.94	4
Collision Avoidance	ENV2 EASY (Full observation)	Success Rate (SR) 98	4
Collision Avoidance	ENV3 MEDIUM Full observation	SR 82	4
Collision Avoidance	ENV4 HARD (Full observation)	Success Rate 0.75	4
Collision Avoidance	ENV3 MEDIUM (Limited observation)	Success Rate 76	4
Pick Tape	Real-world environment	Success Rate 80	3
Pick and Place Cup	Real-world environment	SR 100	3

