Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints
About
This study presents a benchmark for evaluating action-constrained reinforcement learning (RL) algorithms. In action-constrained RL, each action taken by the learning system must comply with certain constraints. These constraints are crucial for ensuring the feasibility and safety of actions in real-world systems. We evaluate existing algorithms and their novel variants across multiple robotics control environments, encompassing multiple action constraint types. Our evaluation provides the first in-depth perspective on the field, revealing surprising insights, including the effectiveness of a straightforward baseline approach. The benchmark problems and associated code used in our experiments are available online at github.com/omron-sinicx/action-constrained-RL-benchmark for further research and development.
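To make the notion of an action constraint concrete, the sketch below shows one common way to force a proposed action to comply with a constraint: projecting it onto the feasible set. The constraint shapes (an L2-norm ball and a per-dimension box) and the function names `project_to_l2_ball` and `project_to_box` are illustrative assumptions; the abstract does not specify which constraint types or which baseline the benchmark uses.

```python
import math
from typing import List, Sequence


def project_to_l2_ball(action: Sequence[float], radius: float) -> List[float]:
    """Return the point closest to `action` whose Euclidean norm is <= radius.

    Assumption: the feasible set is an L2 ball centred at the origin.
    This is an illustrative constraint, not one taken from the benchmark.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    norm = math.sqrt(sum(a * a for a in action))
    if norm <= radius:
        # Already feasible: leave the action unchanged.
        return list(action)
    # Infeasible: rescale onto the boundary of the ball.
    scale = radius / norm
    return [a * scale for a in action]


def project_to_box(
    action: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> List[float]:
    """Clip each action dimension into [low_i, high_i] (hypothetical box constraint)."""
    if not (len(action) == len(low) == len(high)):
        raise ValueError("action, low and high must have the same length")
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]


if __name__ == "__main__":
    raw_action = [3.0, 4.0]  # norm 5, outside a ball of radius 1
    print(project_to_l2_ball(raw_action, radius=1.0))
    print(project_to_box([1.5, -2.0, 0.3], [-1.0] * 3, [1.0] * 3))
```

In a training loop, a projection of this kind would typically be applied to the policy output before the action is sent to the environment, so that every executed action is feasible.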
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Reinforcement Learning | Hopper delta=[0.2, 0.5, 0.5], kappa=2.5 v5 (test) | Return: 3.26e+3 | 12 |
| Reinforcement Learning | Ant delta=[0.2^4, 0.5^4], kappa=2.5 v5 (test) | Return: 2.92e+3 | 12 |
| Reinforcement Learning | Humanoid (delta=[0.8^6, 0.5^6, 0.2^5], kappa=4.0) v5 (test) | Return: 5.29e+3 | 12 |
| Reinforcement Learning | HalfCheetah delta=[0.2^3, 0.5^3], kappa=2.5 v5 (test) | Return: 3.80e+3 | 12 |
| Humanoid Locomotion | IsaacLab Unitree G1 Flat terrain, κ=4.0 | Return: 5.21e+3 | 6 |
| Reinforcement Learning | Hopper tight heterogeneous constraints v5 (test) | Return: 3.26e+3 | 6 |
| Humanoid Locomotion | IsaacLab Unitree H1 Rough terrain, κ≈2.2 | Return: 23.11 | 6 |
| Reinforcement Learning | Ant tight heterogeneous constraints v5 (test) | Return: 2.92e+3 | 6 |
| Reinforcement Learning | Humanoid tight heterogeneous constraints v5 (test) | Return: 5.29e+3 | 6 |
| Reinforcement Learning | HalfCheetah tight heterogeneous constraints v5 (test) | Return: 3.64e+3 | 6 |