
Improving Generalization in Meta-RL with Imaginary Tasks from Latent Dynamics Mixture

About

The generalization ability of most meta-reinforcement learning (meta-RL) methods is largely limited to test tasks that are sampled from the same distribution used to sample training tasks. To overcome the limitation, we propose Latent Dynamics Mixture (LDM) that trains a reinforcement learning agent with imaginary tasks generated from mixtures of learned latent dynamics. By training a policy on mixture tasks along with original training tasks, LDM allows the agent to prepare for unseen test tasks during training and prevents the agent from overfitting the training tasks. LDM significantly outperforms standard meta-RL methods in test returns on the gridworld navigation and MuJoCo tasks where we strictly separate the training task distribution and the test task distribution.
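The idea of building an imaginary task from a mixture of learned latent dynamics can be illustrated with a minimal sketch. It assumes each training task is summarized by one latent vector and that a mixture is a convex combination of those vectors with Dirichlet-sampled weights. The function name `mix_latents`, the `alpha` parameter, and the toy latent values are hypothetical; the abstract does not specify how LDM samples its weights or how the mixed latent is decoded into rewards and transitions, so this is not the authors' implementation.

```python
import numpy as np


def mix_latents(latents, rng, alpha=1.0):
    """Return a convex combination of per-task latent vectors and its weights.

    latents: array of shape (num_tasks, latent_dim), one row per training task.
    alpha:   Dirichlet concentration (hypothetical choice, not from the paper).
    """
    latents = np.asarray(latents, dtype=float)
    # Weights are non-negative and sum to 1, so the result lies inside the
    # convex hull of the training-task latents.
    weights = rng.dirichlet(alpha * np.ones(latents.shape[0]))
    return weights @ latents, weights


rng = np.random.default_rng(0)
# Toy latents for three training tasks (illustrative values only).
train_latents = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
mixed, weights = mix_latents(train_latents, rng)
```

In a full pipeline, the mixed latent would condition a learned dynamics or reward decoder to produce the imaginary task the policy trains on; that step is omitted here because the abstract gives no details about it.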

Suyoung Lee, Sae-Young Chung • 2021

Related benchmarks

Task | Dataset | Result | Rank
Push | Meta-World ML-1 (test) | Success Rate: 0.83 | 12
Push | MetaWorld ML1 Push-OOD-Extra (extrapolation) | Average Success Rate: 72 | 9
Reach | MetaWorld ML1 Reach-OOD (interpolation) | Average Success Rate: 87 | 9
Reach | MetaWorld ML1 Reach | Average Success Rate: 76 | 9
Reach | MetaWorld ML1 Reach-OOD-Extra (extrapolation) | Success Rate: 79 | 9
Push | MetaWorld ML1 Push OOD (interpolation) | Average Success Rate: 77 | 9
Reinforcement Learning | Half-cheetah-velocity (train) | Runtime (hours): 31 | 7
