
ACDC: Adaptive Curriculum Planning with Dynamic Contrastive Control for Goal-Conditioned Reinforcement Learning in Robotic Manipulation

About

Goal-conditioned reinforcement learning has shown considerable potential in robotic manipulation; however, existing approaches remain limited by their reliance on prioritizing collected experience, resulting in suboptimal performance across diverse tasks. Inspired by human learning behaviors, we propose a more comprehensive learning paradigm, ACDC, which integrates multidimensional Adaptive Curriculum (AC) Planning with Dynamic Contrastive (DC) Control to guide the agent along a well-designed learning trajectory. More specifically, at the planning level, the AC component schedules the learning curriculum by dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress. At the control level, the DC component implements the curriculum plan through norm-constrained contrastive learning, enabling magnitude-guided experience selection aligned with the current curriculum focus. Extensive experiments on challenging robotic manipulation tasks demonstrate that ACDC consistently outperforms the state-of-the-art baselines in both sample efficiency and final task success rate.
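The abstract does not state the scheduling rule used by the AC component, so the following is only an illustrative sketch of what "dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress" could look like. Everything here is an assumption rather than the authors' method: the function names `curriculum_weight` and `sampling_probs`, the linear averaging of success rate and progress, the softmax over blended scores, and the idea that each stored experience already carries a scalar diversity score and quality score. The norm-constrained contrastive learning of the DC component is not modelled.

```python
import math

def curriculum_weight(success_rate: float, progress: float) -> float:
    """Hypothetical schedule: weight on quality-driven exploitation, in [0, 1].

    Assumption: the weight is the plain average of the success rate and the
    fraction of training completed. The paper's actual rule is not given here.
    """
    lam = 0.5 * (success_rate + progress)
    return min(1.0, max(0.0, lam))

def sampling_probs(diversity, quality, success_rate, progress):
    """Turn per-experience scores into replay sampling probabilities.

    diversity, quality: equal-length sequences of floats (hypothetical scores).
    Returns a list of probabilities that sums to 1.
    """
    lam = curriculum_weight(success_rate, progress)
    # Blend the two criteria: low lam favours diversity, high lam favours quality.
    scores = [(1.0 - lam) * d + lam * q for d, q in zip(diversity, quality)]
    # Softmax with max-subtraction for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Three stored experiences with made-up scores.
diversity = [0.9, 0.1, 0.5]
quality = [0.1, 0.9, 0.5]
early = sampling_probs(diversity, quality, success_rate=0.0, progress=0.0)
late = sampling_probs(diversity, quality, success_rate=1.0, progress=1.0)
```

Under these assumptions, `early` puts the most probability on the most diverse experience and `late` puts the most on the highest-quality one, which mirrors the exploration-to-exploitation shift the abstract describes.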

Xuerui Wang, Guangyu Ren, Tianhong Dai, Bintao Hu, Shuangyao Huang, Wenzhang Zhang, Hengyan Liu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robotic Block Manipulation | HandManipulateBlockFull v0 | Success Rate | 25 | 10 |
| Robotic Egg Manipulation | HandManipulateEggFull v0 | Success Rate | 69 | 10 |
| Robotic Hand Reaching | HandReach v0 | Success Rate | 72 | 10 |
| Robotic Pen Rotation | HandManipulatePenRotate v0 | Success Rate | 28 | 10 |
| Robotic Pick-and-Place | FetchPickAndPlace v1 | Success Rate | 100 | 10 |
| Robotic Pushing | FetchPush v1 | Success Rate | 100 | 10 |
| Robotic Manipulation | FetchPush v1 | Time-to-Threshold (Epochs) | 8 | 5 |
| Robotic Manipulation | FetchPickAndPlace v1 | Time-to-Threshold (Epochs) | 18 | 5 |
| Robotic Manipulation | HandReach v0 | Cumulative Regret | 44.1 | 5 |
| Robotic Manipulation | HandManipulateEggFull v0 | Cumulative Regret (R) | 62.8 | 5 |
Showing 10 of 12 rows
