
Set-Supervised Diffusion Policy: Learning Action-Chunking Diffusion through Corrections

About

Diffusion policies have recently emerged as a powerful framework for robotic manipulation. However, like other behavior cloning methods, they remain vulnerable to distributional shift, often requiring human-in-the-loop interventions to correct failures during deployment. These interactions naturally provide paired supervision in the form of the robot's undesired actions and the human teacher's corrective actions. Yet existing data aggregation pipelines and standard behavior cloning losses largely ignore the negative signal from undesired actions, leading to overfitting to the teacher's actions and an increasing reliance on costly expert data. To address this limitation, we propose Set-Supervised Diffusion Policy (SDP), a novel learning framework that uses contrastive action-chunk data to train diffusion policies from human corrections. From paired positive and negative action chunks, SDP constructs a set of desired action chunks and uses a training pipeline that encourages the diffusion policy to align with this set. Through extensive experiments across multiple robotic manipulation tasks, we demonstrate that SDP consistently improves policy performance, with particularly strong gains in robustness to noisy data. Moreover, SDP induces high-quality aggregated datasets, enabling more efficient and reliable policy learning from human-in-the-loop corrections. Our code is available at https://set-supervised-diffusion-policy.github.io/.
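The paired supervision described in the abstract can be pictured with a small data-structure sketch. Everything below is illustrative: the `CorrectionPair` container, the `squared_distance` helper, and the `in_desired_set` rule are hypothetical names. In particular, the membership rule (a chunk counts as desired if it lies closer to the teacher's chunk than to the robot's) is only an assumption made for this example; the abstract does not say how SDP actually constructs its set.

```python
from dataclasses import dataclass
from typing import List

# An action chunk is a short sequence of action vectors (timesteps x action_dim).
ActionChunk = List[List[float]]


@dataclass
class CorrectionPair:
    """One human correction (hypothetical container, not taken from the paper)."""
    observation: List[float]
    negative_chunk: ActionChunk  # the robot's undesired actions
    positive_chunk: ActionChunk  # the human teacher's corrective actions


def squared_distance(a: ActionChunk, b: ActionChunk) -> float:
    """Sum of squared element-wise differences between two equally shaped chunks."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("action chunks must have the same shape")
    return sum(
        (x - y) ** 2
        for step_a, step_b in zip(a, b)
        for x, y in zip(step_a, step_b)
    )


def in_desired_set(candidate: ActionChunk, pair: CorrectionPair) -> bool:
    """ASSUMPTION for illustration only: a candidate is 'desired' when it is
    strictly closer to the positive chunk than to the negative chunk."""
    return (
        squared_distance(candidate, pair.positive_chunk)
        < squared_distance(candidate, pair.negative_chunk)
    )


pair = CorrectionPair(
    observation=[0.0, 0.0],
    negative_chunk=[[0.0, 0.0], [0.0, 0.0]],
    positive_chunk=[[1.0, 0.0], [1.0, 0.0]],
)
near_teacher = [[0.9, 0.0], [0.8, 0.0]]
is_desired = in_desired_set(near_teacher, pair)
robot_is_desired = in_desired_set(pair.negative_chunk, pair)
```

Under this assumed rule, a chunk near the teacher's correction falls inside the set while the robot's own undesired chunk falls outside it, which is the kind of negative signal a standard behavior cloning loss would discard.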

Zhaoting Li, Gang Chen, Javier Alonso-Mora, Cosimo Della Santina, Jens Kober • 2026

Related benchmarks

Task                  Dataset              Result (Success Rate)  Rank
Interactive Learning  Push-T Accurate      72.9                   6
Interactive Learning  Square Accurate      98.8                   6
Interactive Learning  PickCan Accurate     99.8                   6
Interactive Learning  TwoArmLift Accurate  95.7                   6
Interactive Learning  Push-T Noisy         60                     6
Interactive Learning  Square Noisy         94.6                   6
Interactive Learning  TwoArmLift Noisy     92.4                   6
Interactive Learning  PickCan Noisy        99                     6
Robot Manipulation    Insert-T (Medium)    39                     4
Robot Manipulation    Insert-T Hard        35                     4
