Set-Supervised Diffusion Policy: Learning Action-Chunking Diffusion through Corrections

About

Diffusion policies have recently emerged as a powerful framework for robotic manipulation. However, like other behavior cloning methods, they remain vulnerable to distributional shift and often require human-in-the-loop interventions to correct failures during deployment. These interactions naturally provide paired supervision: the robot's undesired actions and the human teacher's corrective actions. Yet existing data aggregation pipelines and standard behavior cloning losses largely ignore the negative signal from undesired actions, leading to overfitting to the teacher's actions and an increasing reliance on costly expert data. To address this limitation, we propose Set-Supervised Diffusion Policy (SDP), a novel learning framework that uses contrastive action-chunk data to train diffusion policies from human corrections. From paired positive and negative action chunks, SDP constructs a set of desired action chunks and uses a training pipeline that encourages the diffusion policy to align with this set. Through extensive experiments across multiple robotic manipulation tasks, we demonstrate that SDP consistently improves policy performance, with particularly strong gains in robustness to noisy data. Moreover, SDP induces high-quality aggregated datasets, enabling more efficient and reliable policy learning from human-in-the-loop corrections. Our code is available at https://set-supervised-diffusion-policy.github.io/.

Zhaoting Li, Gang Chen, Javier Alonso-Mora, Cosimo Della Santina, Jens Kober • 2026

Related benchmarks

Task	Dataset	Result
Interactive Learning	Push-T Accurate	Success Rate: 72.9	6
Interactive Learning	Square Accurate	Success Rate: 98.8	6
Interactive Learning	PickCan Accurate	Success Rate: 99.8	6
Interactive Learning	TwoArmLift Accurate	Success Rate: 95.7	6
Interactive Learning	Push-T Noisy	Success Rate: 60	6
Interactive Learning	Square Noisy	Success Rate: 94.6	6
Interactive Learning	TwoArmLift Noisy	Success Rate: 92.4	6
Interactive Learning	PickCan Noisy	Success Rate: 99	6
Robot Manipulation	Insert-T (Medium)	Success Rate: 39	4
Robot Manipulation	Insert-T Hard	Success Rate: 35	4

