Diffused Task-Agnostic Milestone Planner

About

Addressing decision-making problems using sequence modeling to predict future trajectories has shown promising results in recent years. In this paper, we take a step further and apply the sequence-prediction approach to wider areas such as long-term planning, vision-based control, and multi-task decision-making. To this end, we propose a method that uses a diffusion-based generative sequence model to plan a series of milestones in a latent space and has an agent follow the milestones to accomplish a given task. The proposed method can learn control-relevant, low-dimensional latent representations of milestones, which makes it possible to efficiently perform long-term planning and vision-based control. Furthermore, our approach exploits the generation flexibility of the diffusion model, which makes it possible to plan diverse trajectories for multi-task decision-making. We demonstrate the proposed method across offline reinforcement learning (RL) benchmarks and a visual manipulation environment. The results show that our approach outperforms offline RL methods in solving long-horizon, sparse-reward tasks and multi-task problems, while also achieving state-of-the-art performance on the most challenging vision-based manipulation benchmark.

Mineui Hong, Minjae Kang, Songhwai Oh • 2023

Related benchmarks

Task	Dataset	Result
Locomotion	D4RL walker2d-medium-expert	Normalized Score 108.2	90
walker2d locomotion	D4RL walker2d medium-replay	Normalized Score 79.5	78
hopper locomotion	D4RL hopper medium-replay	Normalized Score 100	71
Locomotion	D4RL Walker2d medium	Normalized Score 82.7	70
Locomotion	D4RL Halfcheetah medium	Normalized Score 47.3	70
hopper locomotion	D4RL hopper-medium-expert	Normalized Score 109.4	53
Locomotion	D4RL halfcheetah-medium-expert	Normalized Score 88.2	53
hopper locomotion	D4RL Hopper medium	Normalized Score 80.7	38
HalfCheetah	D4RL Medium-Replay v0	Normalized Score 42.6	28
Offline Reinforcement Learning	D4RL AntMaze medium-play v2	Averaged Score 89.3	4

Showing 10 of 14 rows
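
For readers unfamiliar with the "Normalized Score" metric in the D4RL rows above: D4RL results are conventionally reported by rescaling the raw episode return so that a random policy maps to 0 and an expert policy maps to 100. The sketch below illustrates that rescaling. The function name and the reference returns in the usage lines are hypothetical placeholders chosen for easy arithmetic, not the actual D4RL reference values and not numbers from this paper.

```python
def normalized_score(raw_return: float,
                     random_return: float,
                     expert_return: float) -> float:
    """Rescale a raw return so random policy -> 0 and expert policy -> 100.

    This follows the usual D4RL reporting convention:
        100 * (raw - random) / (expert - random)
    The reference returns are per-environment constants supplied by the caller.
    """
    span = expert_return - random_return
    if span == 0:
        raise ValueError("expert_return and random_return must differ")
    return 100.0 * (raw_return - random_return) / span


# Hypothetical reference returns, for illustration only.
RANDOM_RETURN = 10.0
EXPERT_RETURN = 3010.0

# A raw return halfway between the two references scores 50.
print(normalized_score(1510.0, RANDOM_RETURN, EXPERT_RETURN))
```

Because the scale is anchored to reference policies rather than capped, scores above 100 (such as the 108.2 and 109.4 entries in the table) simply mean the evaluated policy collected more return than the expert reference.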
