
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning

About

Model-based reinforcement learning (RL) enjoys several benefits, such as data efficiency and planning, by learning a model of the environment's dynamics. However, learning a global model that can generalize across different dynamics is a challenging task. To tackle this problem, we decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, and (b) predicting the next state conditioned on that context. In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics. The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
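The two-stage decomposition can be sketched in a few lines: a context encoder maps recent transitions to a latent vector, and forward/backward one-step predictors are both conditioned on that latent, with the training loss summing their prediction errors. The sketch below uses plain numpy with linear stand-ins for the learned networks; all dimensions, weight shapes, and function names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (illustrative only).
STATE_DIM, ACTION_DIM, CONTEXT_DIM, K = 4, 2, 3, 5

# Linear stand-ins for the learned networks (in practice these are trained MLPs).
W_enc = rng.normal(size=(K * (2 * STATE_DIM + ACTION_DIM), CONTEXT_DIM))
W_fwd = rng.normal(size=(STATE_DIM + ACTION_DIM + CONTEXT_DIM, STATE_DIM))
W_bwd = rng.normal(size=(STATE_DIM + ACTION_DIM + CONTEXT_DIM, STATE_DIM))

def encode_context(past_transitions):
    """Stage (a): map the K most recent (s, a, s') transitions
    to a context latent vector capturing the local dynamics."""
    return np.tanh(past_transitions.reshape(-1) @ W_enc)

def forward_backward_loss(past_transitions, s_t, a_t, s_next):
    """Sum of forward and backward one-step prediction errors,
    both conditioned on the inferred context latent."""
    c = encode_context(past_transitions)
    # Stage (b): forward model predicts s_{t+1} from (s_t, a_t, c).
    pred_next = np.concatenate([s_t, a_t, c]) @ W_fwd
    # Backward model predicts s_t from (s_{t+1}, a_t, c).
    pred_prev = np.concatenate([s_next, a_t, c]) @ W_bwd
    return np.mean((pred_next - s_next) ** 2) + np.mean((pred_prev - s_t) ** 2)

# Toy rollout data standing in for environment transitions.
past = rng.normal(size=(K, 2 * STATE_DIM + ACTION_DIM))
s_t = rng.normal(size=STATE_DIM)
a_t = rng.normal(size=ACTION_DIM)
s_next = rng.normal(size=STATE_DIM)
loss = forward_backward_loss(past, s_t, a_t, s_next)
```

Minimizing this combined loss is what pushes dynamics-specific information into the context vector: a latent useful in both temporal directions must capture the local dynamics rather than incidental trajectory features.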

Kimin Lee, Younggyo Seo, Seunghyun Lee, Honglak Lee, Jinwoo Shin • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Reinforcement Learning | CartPole Pure | Average Reward (2/0.5) | 171.1 | 30 |
| Reinforcement Learning | MountainCar (Pure) | Avg Reward (gamma=0.01) | -55.23 | 30 |
| Reinforcement Learning | HalfCheetah Random | Average Reward (0.3/1.7) | 754.6 | 10 |
| Reinforcement Learning | HalfCheetah Pure-8-20 | Average Reward (0.3/1.7) | 1.08e+3 | 10 |
| Reinforcement Learning | HalfCheetah (Pure-8-40) | Average Reward (0.3/1.7) | 1.50e+3 | 10 |
| Reinforcement Learning | HalfCheetah Pure | Average Reward (0.3/1.7) | 481.1 | 10 |
| Reinforcement Learning | MountainCar (Random) | Avg Reward (gamma=0.01) | -62.57 | 10 |
| Reinforcement Learning | CartPole Pure-8-40 | Average Reward (EpLen=2, DF=0.5) | 160.6 | 10 |
| Robotic Manipulation | Sawyer-Peg raw pixels (test) | Best Avg Return | 4.04 | 5 |
