MRS: Multi-Resolution Skills for HRL Agents
About
Hierarchical reinforcement learning (HRL) decomposes the policy into a manager and a worker, enabling long-horizon planning but introducing a performance gap relative to non-hierarchical methods on tasks requiring agility. We identify a root cause: in subgoal-based HRL, the manager's goal representation is typically learned without constraints on reachability or temporal distance from the current state, preventing precise local subgoal selection. We further show that the optimal subgoal distance is both task- and state-dependent: nearby subgoals enable precise control but amplify prediction noise, while distant subgoals produce smoother motion at the cost of geometric precision. We propose Multi-Resolution Skills (MRS), which learns multiple goal-prediction modules, each specialized to a fixed temporal horizon, with a jointly trained meta-controller that selects among them based on the current state. MRS consistently outperforms fixed-resolution baselines and significantly reduces the performance gap between HRL and state-of-the-art non-HRL methods on DeepMind Control Suite, Gym-Robotics, and long-horizon AntMaze tasks. [Project page: https://sites.google.com/view/multi-res-skills/home]
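The selection mechanism described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `GoalPredictor`, `select_subgoal`, `make_linear_predictor`, and `toy_meta_logits` are hypothetical, the horizons (4, 16, 64) are arbitrary, and both the goal predictors and the meta-controller are hand-written toy functions standing in for learned networks.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]  # toy state: (position, velocity)


@dataclass
class GoalPredictor:
    """Hypothetical stand-in for one learned goal-prediction module."""
    horizon: int  # fixed temporal horizon (in steps) this module targets
    predict: Callable[[State], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_subgoal(
    state: State,
    predictors: Sequence[GoalPredictor],
    meta_logits: Callable[[State], Sequence[float]],
) -> Tuple[int, List[float]]:
    """Greedily pick one resolution for this state; return (horizon, subgoal)."""
    probs = softmax(meta_logits(state))
    idx = max(range(len(probs)), key=probs.__getitem__)
    chosen = predictors[idx]
    return chosen.horizon, chosen.predict(state)


def make_linear_predictor(horizon: int) -> GoalPredictor:
    # Assumption: a toy predictor that extrapolates position by velocity * horizon.
    return GoalPredictor(horizon, lambda s, h=horizon: [s[0] + s[1] * h, s[1]])


PREDICTORS = [make_linear_predictor(h) for h in (4, 16, 64)]


def toy_meta_logits(state: State) -> List[float]:
    # Assumption: an arbitrary hand-written rule in place of a trained
    # meta-controller. Faster motion prefers a shorter horizon.
    preferred = 64.0 / (1.0 + abs(state[1]))
    return [-abs(math.log(p.horizon) - math.log(preferred)) for p in PREDICTORS]


if __name__ == "__main__":
    for s in [(0.0, 0.0), (1.0, 3.0), (0.0, 15.0)]:
        horizon, subgoal = select_subgoal(s, PREDICTORS, toy_meta_logits)
        print(f"state={s} -> horizon={horizon}, subgoal={subgoal}")
```

The point of the sketch is only the structure: several predictors, each tied to one temporal horizon, and a state-conditioned selector that chooses which one supplies the subgoal to the worker. In the method as described, both parts are learned jointly rather than hand-specified.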
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Hopper Hop | DeepMind Control suite | Average Return | 511 | 8 |
| Cheetah Run | DeepMind Control suite | Average Return | 866 | 8 |
| Walker Run | DeepMind Control suite | Average Return | 745 | 8 |
| Long-horizon sparse reward navigation | AntMaze Medium | Cumulative Episodic Rewards | 2.30e+3 | 4 |
| Long-horizon sparse reward navigation | AntMaze Large | Cumulative Episodic Rewards | 2.22e+3 | 4 |
| quadruped_run | DeepMind Control Suite (DMC) | Cumulative Episodic Reward | 925 | 4 |
| cartpole_swingup | DeepMind Control Suite (DMC) | Cumulative Reward | 842 | 4 |
| pendulum_swingup | DeepMind Control Suite (DMC) | Cumulative Episodic Reward | 526 | 4 |
| fetch_pick_place | Gymnasium Robotics | Cumulative Episodic Reward | 29 | 4 |
| fetch_push | Gymnasium Robotics | Cumulative Reward | 36 | 4 |