
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning

About

Model-based reinforcement learning (RL) enjoys several benefits, such as data efficiency and planning, by learning a model of the environment's dynamics. However, learning a global model that can generalize across different dynamics is a challenging task. To tackle this problem, we decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, and (b) predicting the next state conditioned on that context. In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics. The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
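The two-stage decomposition can be sketched in a few lines: a context encoder maps recent transitions to a latent vector, and forward/backward one-step predictors are both conditioned on that latent, with the training loss summing their prediction errors. The sketch below uses plain numpy with linear stand-ins for the learned networks; all dimensions, weight shapes, and function names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (illustrative only).
STATE_DIM, ACTION_DIM, CONTEXT_DIM, K = 4, 2, 3, 5

# Linear stand-ins for the learned networks (in practice these are trained MLPs).
W_enc = rng.normal(size=(K * (2 * STATE_DIM + ACTION_DIM), CONTEXT_DIM))
W_fwd = rng.normal(size=(STATE_DIM + ACTION_DIM + CONTEXT_DIM, STATE_DIM))
W_bwd = rng.normal(size=(STATE_DIM + ACTION_DIM + CONTEXT_DIM, STATE_DIM))

def encode_context(past_transitions):
    """Stage (a): map the K most recent (s, a, s') transitions
    to a context latent vector capturing the local dynamics."""
    return np.tanh(past_transitions.reshape(-1) @ W_enc)

def forward_backward_loss(past_transitions, s_t, a_t, s_next):
    """Sum of forward and backward one-step prediction errors,
    both conditioned on the inferred context latent."""
    c = encode_context(past_transitions)
    # Stage (b): forward model predicts s_{t+1} from (s_t, a_t, c).
    pred_next = np.concatenate([s_t, a_t, c]) @ W_fwd
    # Backward model predicts s_t from (s_{t+1}, a_t, c).
    pred_prev = np.concatenate([s_next, a_t, c]) @ W_bwd
    return np.mean((pred_next - s_next) ** 2) + np.mean((pred_prev - s_t) ** 2)

# Toy rollout data standing in for environment transitions.
past = rng.normal(size=(K, 2 * STATE_DIM + ACTION_DIM))
s_t = rng.normal(size=STATE_DIM)
a_t = rng.normal(size=ACTION_DIM)
s_next = rng.normal(size=STATE_DIM)
loss = forward_backward_loss(past, s_t, a_t, s_next)
```

Minimizing this combined loss is what pushes dynamics-specific information into the context vector: a latent useful in both temporal directions must capture the local dynamics rather than incidental trajectory features.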

Kimin Lee, Younggyo Seo, Seunghyun Lee, Honglak Lee, Jinwoo Shin • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Reinforcement Learning | CartPole Pure | Average Reward (2/0.5) | 171.1 | 30 |
| Reinforcement Learning | MountainCar (Pure) | Avg Reward (gamma=0.01) | -55.23 | 30 |
| Reinforcement Learning | HalfCheetah Random | Average Reward (0.3/1.7) | 754.6 | 10 |
| Reinforcement Learning | HalfCheetah Pure-8-20 | Average Reward (0.3/1.7) | 1.08e+3 | 10 |
| Reinforcement Learning | HalfCheetah (Pure-8-40) | Average Reward (0.3/1.7) | 1.50e+3 | 10 |
| Reinforcement Learning | HalfCheetah Pure | Average Reward (0.3/1.7) | 481.1 | 10 |
| Reinforcement Learning | MountainCar (Random) | Avg Reward (gamma=0.01) | -62.57 | 10 |
| Reinforcement Learning | CartPole Pure-8-40 | Average Reward (EpLen=2, DF=0.5) | 160.6 | 10 |
| Robotic Manipulation | Sawyer-Peg raw pixels (test) | Best Avg Return | 4.04 | 5 |
