Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

About

Existing research on vision and language grounding for robot navigation focuses on improving model-free deep reinforcement learning (DRL) models in synthetic environments. However, model-free DRL models do not consider the dynamics of real-world environments, and they often fail to generalize to new scenes. In this paper, we take a radical approach to bridge the gap between synthetic studies and real-world practices: we propose a novel, planned-ahead hybrid reinforcement learning model that combines model-free and model-based reinforcement learning to solve a real-world vision-language navigation task. Our look-ahead module tightly integrates a look-ahead policy model with an environment model that predicts the next state and the reward. Experimental results suggest that our proposed method significantly outperforms the baselines and achieves the best results on the real-world Room-to-Room dataset. Moreover, our scalable method generalizes better when transferred to unseen environments.

Xin Wang, Wenhan Xiong, Hongmin Wang, William Yang Wang • 2018
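
The abstract describes a look-ahead module in which an environment model predicts the next state and reward, and those predictions are combined with a model-free policy. The toy sketch below illustrates that general idea only; it is not the authors' architecture. Every name in it (`imagined_return`, `plan_ahead`, the corridor environment, the `mix` weight, the discount `gamma`) is a hypothetical choice made for illustration, and the hand-written functions stand in for what would be learned networks.

```python
from typing import Callable, Sequence, Tuple

State = int
Action = int
EnvModel = Callable[[State, Action], Tuple[State, float]]


def imagined_return(state: State, first_action: Action, env_model: EnvModel,
                    rollout_policy: Callable[[State], Action],
                    depth: int, gamma: float = 0.9) -> float:
    """Roll the environment model forward `depth` steps and sum discounted rewards."""
    total, discount, action = 0.0, 1.0, first_action
    for _ in range(depth):
        state, reward = env_model(state, action)  # predicted next state and reward
        total += discount * reward
        discount *= gamma
        action = rollout_policy(state)            # look-ahead policy picks the next imagined action
    return total


def plan_ahead(state: State, actions: Sequence[Action],
               model_free_score: Callable[[State, Action], float],
               env_model: EnvModel,
               rollout_policy: Callable[[State], Action],
               depth: int = 3, mix: float = 0.5) -> Action:
    """Pick the action with the best blend of model-free and imagined (model-based) scores."""
    def score(action: Action) -> float:
        imagined = imagined_return(state, action, env_model, rollout_policy, depth)
        return (1.0 - mix) * model_free_score(state, action) + mix * imagined
    return max(actions, key=score)


# Hypothetical toy problem: a 1-D corridor with the goal at position 5.
GOAL = 5

def toy_env_model(state: State, action: Action) -> Tuple[State, float]:
    nxt = state + action
    return nxt, (1.0 if nxt == GOAL else 0.0)

def toy_rollout_policy(state: State) -> Action:
    return 1 if state < GOAL else -1

def toy_model_free_score(state: State, action: Action) -> float:
    return 0.0  # deliberately uninformative, so the imagined rollouts decide


best = plan_ahead(3, [-1, 1], toy_model_free_score, toy_env_model, toy_rollout_policy)
print(best)
```

In this toy setting the model-free score carries no signal, so the choice is driven entirely by the imagined rollouts; with a trained model-free policy, `mix` would control how much the planner trusts each source.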

Related benchmarks

Task | Dataset | Result | Rank
Vision-and-Language Navigation | R2R (val unseen) | Success Rate (SR) 25 | 260
Vision-Language Navigation | R2R (test unseen) | SR 25 | 122
Vision-Language Navigation | R2R (val seen) | Success Rate (SR) 43 | 120
Vision-Language Navigation | R2R Unseen (test) | SR 25 | 116
Vision-and-Language Navigation | Room-to-Room (R2R) Unseen (val) | SR 25 | 52
Vision-and-Language Navigation | R2R (test) | SPL (Success weighted Path Length) 23 | 38
Vision-and-Language Navigation | Room-to-Room (R2R) Seen (val) | NE (Navigation Error) 5.56 | 32
Vision-and-Language Navigation | Room-to-Room (R2R) (test unseen) | SR 25 | 24
Vision-Language Navigation | R2R VLN Challenge Leaderboard (test) | PL 9.15 | 16
Vision-and-Language Navigation | Room-to-Room (R2R) unseen Leaderboard (test) | NL 9.15 | 13
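
The benchmark rows report SR, SPL, and NE without defining them. The sketch below shows how these metrics are commonly computed for navigation episodes. It assumes the usual R2R convention that an episode succeeds when the agent stops within 3 m of the goal; the function name `navigation_metrics` and the sample numbers are hypothetical and are not taken from this paper's results.

```python
from typing import Dict, Sequence


def navigation_metrics(final_dist: Sequence[float],
                       path_len: Sequence[float],
                       shortest_len: Sequence[float],
                       threshold: float = 3.0) -> Dict[str, float]:
    """Compute SR, NE and SPL over a set of episodes.

    final_dist   : distance from the stopping point to the goal (metres)
    path_len     : length of the path the agent actually took (metres)
    shortest_len : shortest-path length from start to goal (metres, assumed > 0)
    threshold    : success radius; 3 m is the assumed R2R convention
    """
    n = len(final_dist)
    success = [1.0 if d < threshold else 0.0 for d in final_dist]
    sr = sum(success) / n                       # Success Rate
    ne = sum(final_dist) / n                    # Navigation Error (mean final distance)
    spl = sum(s * l / max(p, l)                 # Success weighted by Path Length
              for s, p, l in zip(success, path_len, shortest_len)) / n
    return {"SR": sr, "NE": ne, "SPL": spl}


# Two made-up episodes: one success with a slightly long path, one failure.
metrics = navigation_metrics(final_dist=[1.0, 5.0],
                             path_len=[12.0, 8.0],
                             shortest_len=[10.0, 10.0])
print(metrics)
```

SPL can never exceed SR, because each successful episode contributes at most 1 and is discounted whenever the agent's path is longer than the shortest path.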