How to Mitigate the Distribution Shift Problem in Robotics Control: A Robust and Adaptive Approach Based on Offline to Online Imitation Learning

About

Distribution shift in imitation learning is the problem that an agent cannot choose appropriate actions in states it did not visit during training. The problem stems largely from the narrow state-action coverage that expert demonstrations provide relative to the full environment. In this paper, we propose a framework that pairs robust offline imitation learning with adaptive online imitation learning and handles distribution shift in a lifelong, multi-phase scheme. In the offline learning phase, we broaden the policy's state-action coverage with supplementary demonstrations, using a discriminator to train the policy on them effectively, which makes the policy more robust to distribution shift. In the subsequent online inference phase, the framework detects when distribution shift occurs and performs self-supervised imitation learning on online experience to adapt the policy to the online environment. Through extensive evaluations in MuJoCo environments, we demonstrate that our method is more robust to distribution shift and adapts better to online environments than the baseline algorithms.
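The two phases described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the loss or the shift detector, so the sketch assumes (1) a behavior-cloning loss in which each supplementary sample is weighted by a discriminator score in [0, 1], and (2) a nearest-neighbor distance threshold as the shift detector. The names `weighted_bc_loss`, `is_shifted`, `discriminator`, and `threshold` are hypothetical.

```python
import math


def squared_error(predicted, target):
    """Sum of squared differences between two action vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def weighted_bc_loss(policy, expert, supplementary, discriminator):
    """Behavior-cloning loss over expert and supplementary demonstrations.

    Expert samples get weight 1. Each supplementary sample is weighted by
    discriminator(state, action), assumed to return a value in [0, 1] that
    estimates how expert-like the sample is (an assumption of this sketch).
    """
    total, weight_sum = 0.0, 0.0
    for state, action in expert:
        total += squared_error(policy(state), action)
        weight_sum += 1.0
    for state, action in supplementary:
        w = discriminator(state, action)
        total += w * squared_error(policy(state), action)
        weight_sum += w
    return total / weight_sum


def is_shifted(state, demo_states, threshold):
    """Flag a state as out of distribution when its Euclidean distance to the
    nearest demonstration state exceeds a hand-chosen threshold (assumption)."""
    nearest = min(math.dist(state, s) for s in demo_states)
    return nearest > threshold
```

In an online loop, a state flagged by `is_shifted` would trigger the adaptation step; how the adaptation targets are produced is left open here because the abstract does not describe it.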

Hyung-Suk Yoon, Seung-Woo Seo • 2026

Related benchmarks

Task	Dataset	Result
Imitation Learning	D4RL HalfCheetah	D4RL Score 93.28	16
Imitation Learning	D4RL Walker2d	D4RL Score 108.8	16
Imitation Learning	D4RL Ant	D4RL Score 92.57	16
Imitation Learning	D4RL Hopper	D4RL Score 94.48	16
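
The "D4RL Score" values are presumably D4RL normalized scores, which rescale a raw episode return so that a random policy scores 0 and the reference expert scores 100. A short sketch of that normalization follows; the reference returns in the example are hypothetical placeholders, not the actual D4RL reference values for any environment.

```python
def d4rl_normalized_score(episode_return, random_return, expert_return):
    """Rescale a raw return so that random -> 0 and expert -> 100."""
    return 100.0 * (episode_return - random_return) / (expert_return - random_return)


# Hypothetical reference returns, for illustration only.
score = d4rl_normalized_score(episode_return=932.8,
                              random_return=0.0,
                              expert_return=1000.0)
print(round(score, 2))
```

Scores above 100, such as the Walker2d entry, mean the policy's return exceeded the reference expert's return.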

