
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation

About

One of the most challenging topics in Natural Language Processing (NLP) is visually-grounded language understanding and reasoning. Outdoor vision-and-language navigation (VLN) is one such task, in which an agent follows natural language instructions to navigate a real-life urban environment. Because human-annotated instructions that describe intricate urban scenes are scarce, outdoor VLN remains difficult to solve. This paper introduces a Multimodal Text Style Transfer (MTST) learning approach that leverages external multimodal resources to mitigate data scarcity in outdoor navigation tasks. We first enrich the navigation data by transferring the style of the instructions generated by the Google Maps API, then pre-train the navigator on the augmented external outdoor navigation dataset. Experimental results show that the MTST learning approach is model-agnostic and significantly outperforms the baseline models on the outdoor VLN task, with a relative improvement of 8.7% in task completion rate on the test set.
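The abstract reports its gain as a relative improvement, which is easy to confuse with a difference in percentage points. The sketch below shows the standard relative-improvement formula. The function name and the numbers are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the gain of `new` over `baseline` as a percentage of `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical scores for illustration only (NOT results from the paper).
baseline_tc = 20.0  # assumed baseline task completion rate, in percent
new_tc = 25.0       # assumed improved task completion rate, in percent

absolute_gain = new_tc - baseline_tc                         # percentage points
relative_gain = relative_improvement(baseline_tc, new_tc)    # percent of baseline

print(f"absolute gain: {absolute_gain} points")
print(f"relative gain: {relative_gain}%")
```

With these assumed values the absolute gain is 5 points while the relative gain is 25%, so a relative figure should always be read against the baseline score it was computed from.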

Wanrong Zhu, Xin Eric Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang • 2020

Related benchmarks

Task | Dataset | Result | Rank
Vision-and-Language Navigation | Touchdown Seen (test) | TC 14.9 | 13
Vision-and-Language Navigation | Touchdown Unseen (test) | nDTW 5.2 | 11
Vision-and-Language Navigation | map2seq Unseen (test) | nDTW 6.1 | 10
Vision-and-Language Navigation | map2seq Seen (test) | nDTW 29.5 | 10
Outdoor Vision-and-Language Navigation | TOUCHDOWN (dev) | TC 15 | 9
Outdoor Vision-and-Language Navigation | TOUCHDOWN (test) | Task Completion Rate (TC) 16.2 | 9
Vision-and-Language Navigation | Touchdown seen (dev) | SDTW 12.9 | 9
Vision-and-Language Navigation | map2seq unseen (dev) | nDTW 6.2 | 8
Vision-and-Language Navigation | map2seq seen (dev) | SDTW 17.5 | 7
Vision-and-Language Navigation | Touchdown unseen (dev) | SDTW 1.9 | 7
Showing 10 of 12 rows
