ACDC: Adaptive Curriculum Planning with Dynamic Contrastive Control for Goal-Conditioned Reinforcement Learning in Robotic Manipulation
About
Goal-conditioned reinforcement learning has shown considerable potential in robotic manipulation; however, existing approaches remain limited by their reliance on prioritizing collected experience, resulting in suboptimal performance across diverse tasks. Inspired by human learning behaviors, we propose a more comprehensive learning paradigm, ACDC, which integrates multidimensional Adaptive Curriculum (AC) Planning with Dynamic Contrastive (DC) Control to guide the agent along a well-designed learning trajectory. More specifically, at the planning level, the AC component schedules the learning curriculum by dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress. At the control level, the DC component implements the curriculum plan through norm-constrained contrastive learning, enabling magnitude-guided experience selection aligned with the current curriculum focus. Extensive experiments on challenging robotic manipulation tasks demonstrate that ACDC consistently outperforms the state-of-the-art baselines in both sample efficiency and final task success rate.
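To make the planning/control split more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical schedule `curriculum_weight` that shifts emphasis from diversity to quality as success rate and training progress grow, and hypothetical per-experience `diversity` and `quality` scores supplied by the caller. In ACDC the selection signal is described as coming from the magnitude of norm-constrained contrastive representations; the scores here are only placeholders for that signal, and the linear schedule and softmax sampling are illustrative choices.

```python
import math
import random


def curriculum_weight(success_rate: float, progress: float) -> float:
    """Hypothetical schedule: weight in [0, 1] placed on quality-driven exploitation.

    Low success rate and early training favor diversity (weight near 0);
    high success rate and late training favor quality (weight near 1).
    """
    if not (0.0 <= success_rate <= 1.0 and 0.0 <= progress <= 1.0):
        raise ValueError("success_rate and progress must lie in [0, 1]")
    return 0.5 * (success_rate + progress)


def selection_probabilities(diversity, quality, weight, temperature=1.0):
    """Softmax over a convex mix of the two placeholder scores."""
    if len(diversity) != len(quality) or not diversity:
        raise ValueError("score lists must be non-empty and of equal length")
    scores = [(1.0 - weight) * d + weight * q for d, q in zip(diversity, quality)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_batch(probs, batch_size, rng):
    """Draw experience indices (with replacement) according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


if __name__ == "__main__":
    diversity = [0.9, 0.2, 0.5]  # placeholder scores, assumed given
    quality = [0.1, 0.8, 0.5]
    rng = random.Random(0)
    for success, progress in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]:
        w = curriculum_weight(success, progress)
        p = selection_probabilities(diversity, quality, w)
        print(w, [round(x, 3) for x in p], sample_batch(p, 4, rng))
```

Early in training the first experience (most diverse) is sampled most often; once the success rate and progress are high, the second one (highest quality) dominates. Any monotone schedule could replace the linear average used here.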
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Robotic Block Manipulation | HandManipulateBlockFull v0 | Success Rate | 25 | 10 |
| Robotic Egg Manipulation | HandManipulateEggFull v0 | Success Rate | 69 | 10 |
| Robotic Hand Reaching | HandReach v0 | Success Rate | 72 | 10 |
| Robotic Pen Rotation | HandManipulatePenRotate v0 | Success Rate | 28 | 10 |
| Robotic Pick-and-Place | FetchPickAndPlace v1 | Success Rate | 100 | 10 |
| Robotic Pushing | FetchPush v1 | Success Rate | 100 | 10 |
| Robotic Manipulation | FetchPush v1 | Time-to-Threshold (Epochs) | 8 | 5 |
| Robotic Manipulation | FetchPickAndPlace v1 | Time to Threshold (Epochs) | 18 | 5 |
| Robotic Manipulation | HandReach v0 | Cumulative Regret | 44.1 | 5 |
| Robotic Manipulation | HandManipulateEggFull v0 | Cumulative Regret (R) | 62.8 | 5 |