
Robust Imitation Learning against Variations in Environment Dynamics

About

In this paper, we propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed. An existing IL framework trained in a single environment can fail catastrophically when the dynamics are perturbed, because it does not account for the possibility that the underlying environment dynamics may change. Our framework handles environments with varying dynamics by imitating multiple experts, each in a sampled environment dynamics, to enhance robustness to general variations in environment dynamics. To robustly imitate the multiple sample experts, we minimize the risk with respect to the Jensen-Shannon divergence between the agent's policy and each of the sample experts. Numerical results show that our algorithm significantly improves robustness against dynamics perturbations compared to conventional IL baselines.
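To make the objective in the abstract more concrete, the following is a minimal sketch of computing the Jensen-Shannon divergence between an agent and each of several sample experts, then aggregating the per-expert values into a single risk. It is not the paper's algorithm: the discrete distributions stand in for whatever quantities the method actually compares, and the function names and the two aggregation modes (`"worst"` and `"mean"`) are illustrative assumptions, since the abstract does not specify the risk measure.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing. The caller must ensure q_i > 0
    wherever p_i > 0 (always true when q is the midpoint used below).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def aggregate_risk(agent, experts, mode="worst"):
    """Combine per-expert JS divergences into one scalar.

    Hypothetical aggregation, assumed for illustration only:
    "worst" takes the largest divergence, "mean" takes the average.
    """
    divergences = [js_divergence(agent, expert) for expert in experts]
    if mode == "worst":
        return max(divergences)
    if mode == "mean":
        return sum(divergences) / len(divergences)
    raise ValueError(f"unknown mode: {mode}")


# Toy action distributions: one agent and two sample experts.
agent = [0.5, 0.3, 0.2]
experts = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
worst = aggregate_risk(agent, experts, mode="worst")
mean = aggregate_risk(agent, experts, mode="mean")
```

In this toy setting, lowering the worst-case value forces the agent to stay close to every sample expert, whereas the mean allows a poor match to one expert to be offset by a good match to another.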

Jongseong Chae, Seungyul Han, Whiyoung Jung, Myungsik Cho, Sungho Choi, Youngchul Sung • 2022

Related benchmarks

Task | Dataset | Result | Rank
Inverse Reinforcement Learning | Ant Leg 1,2 disabled (Source) | Mean Cumulative Rewards: 2.68e+3 | 6
Inverse Reinforcement Learning | Ant Leg 0,3 disabled (Source) | Mean Cumulative Rewards: 2.71e+3 | 6
Inverse Reinforcement Learning | Ant Leg 1,3 disabled (Target) | Mean Cumulative Reward: 2.19e+3 | 6
Inverse Reinforcement Learning | HalfCheetah rear disabled (Source) | Mean Cumulative Reward: 4.27e+3 | 6
Inverse Reinforcement Learning | HalfCheetah front disabled (Source) | Mean Cumulative Reward: 4.13e+3 | 6
Inverse Reinforcement Learning | HalfCheetah no disability (Target) | Mean Cumulative Reward: 4.06e+3 | 6
Inverse Reinforcement Learning | Ant Leg 0,2 disabled (Target) | Mean Cumulative Reward: 2.19e+3 | 6
Inverse Reinforcement Learning | Half Cheetah (Target) | Mean Cumulative Reward: 4.62e+3 | 6