Sample-efficient Cross-Entropy Method for Real-time Planning
About
Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
Cristina Pinneri, Shambhuraj Sawant, Sebastian Blaes, Jan Achterhold, Joerg Stueckler, Michal Rolinek, Georg Martius • 2020
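The abstract names two additions to CEM, temporally correlated actions and memory, without giving details. As a rough illustration only, the sketch below shows one way such a planner could look. Everything in it is an assumption rather than the authors' implementation: the function names `correlated_noise` and `cem_plan`, all parameter names and defaults, the use of AR(1) noise as a simple stand-in for temporally correlated sampling, and the reading of "memory" as carrying a few elite action sequences into the next iteration.

```python
import numpy as np


def correlated_noise(rng, n_samples, horizon, act_dim, rho=0.8):
    """AR(1) noise along the time axis with unit marginal variance.

    Assumption: a simple stand-in for temporally correlated action
    sampling; the source does not specify the noise model.
    """
    eps = rng.standard_normal((n_samples, horizon, act_dim))
    noise = np.empty_like(eps)
    noise[:, 0] = eps[:, 0]
    scale = np.sqrt(1.0 - rho ** 2)
    for t in range(1, horizon):
        noise[:, t] = rho * noise[:, t - 1] + scale * eps[:, t]
    return noise


def cem_plan(cost_fn, horizon, act_dim, n_samples=64, n_elites=8,
             n_iters=5, keep_frac=0.3, rho=0.8, seed=0):
    """Hypothetical CEM planner that minimizes cost_fn over action sequences.

    cost_fn maps an array of shape (horizon, act_dim) to a scalar cost.
    Actions are assumed to be bounded in [-1, 1].
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, act_dim))
    std = np.ones((horizon, act_dim))
    kept = np.empty((0, horizon, act_dim))  # elites remembered across iterations

    for _ in range(n_iters):
        noise = correlated_noise(rng, n_samples, horizon, act_dim, rho)
        samples = np.clip(mean + std * noise, -1.0, 1.0)
        # Memory (assumed form): re-evaluate remembered elites with the new samples.
        candidates = np.concatenate([samples, kept], axis=0)
        costs = np.array([cost_fn(a) for a in candidates])
        elites = candidates[np.argsort(costs)[:n_elites]]  # lowest cost first
        # Refit the sampling distribution to the elite set.
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        kept = elites[:max(1, int(keep_frac * n_elites))]

    return elites[0]  # best action sequence found


if __name__ == "__main__":
    # Toy cost: squared distance of every action from 0.5.
    toy_cost = lambda a: float(np.sum((a - 0.5) ** 2))
    plan = cem_plan(toy_cost, horizon=10, act_dim=2)
    print(plan.shape, toy_cost(plan))
```

Because the best elite is always carried over, the cost of the best candidate cannot increase from one iteration to the next in this sketch. In a receding-horizon setting, only the first action of the returned sequence would typically be executed before replanning.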
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robotic Grasping | Simulation wide friction regime | Success Rate (%) | 79 | 5 |
| Robotic Grasping | Simulation friction regime (nominal) | Success Rate (%) | 100 | 5 |
| Robotic Grasping | Simulation bimodal friction regime | Success Rate (SR) | 42 | 5 |