Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation
About
We propose a learning approach for mapping context-dependent sequential instructions to actions. We address the problem of discourse and state dependencies with an attention-based model that considers both the history of the interaction and the state of the world. To train from start and goal states without access to demonstrations, we propose SESTRA, a learning algorithm that takes advantage of single-step reward observations and immediate expected reward maximization. We evaluate on the SCONE domains, and show absolute accuracy improvements of 9.8%-25.3% across the domains over approaches that use high-level logical representations.
Alane Suhr, Yoav Artzi• 2018
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Sequential Instruction Understanding | SCONE 1.0 (test) | Score (Sce)66.4 | 6 | |
| Sequential Instruction Understanding | SCONE (dev) | Sce. Score56.1 | 3 |
Showing 2 of 2 rows