A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution

About

Natural language provides an accessible and expressive interface to specify long-term tasks for robotic agents. However, non-experts are likely to specify such tasks with high-level instructions, which abstract over specific robot actions through several layers of abstraction. We propose that key to bridging this gap between language and robot actions over long execution horizons are persistent representations. We propose a persistent spatial semantic representation method, and show how it enables building an agent that performs hierarchical reasoning to effectively execute long-term tasks. We evaluate our approach on the ALFRED benchmark and achieve state-of-the-art results, despite completely avoiding the commonly used step-by-step instructions.

Valts Blukis, Chris Paxton, Dieter Fox, Animesh Garg, Yoav Artzi • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Instruction Following | ALFRED (test-unseen) | GC | 27.24 | 23 |
| Embodied Instruction Following | ALFRED seen 1.0 (test) | GC | 35.79 | 20 |
| Embodied Task Completion | ALFRED unseen (test) | Success Rate | 2.45e+3 | 14 |
| Embodied Task Completion | ALFRED seen (test) | Success Rate (SR) | 25.11 | 14 |
| Interactive Planning | ALFRED unseen (val) | Success Rate (SR) | 18.28 | 8 |
| Interactive Planning | ALFRED (val seen) | SR | 29.63 | 6 |
