LTL2Action: Generalizing LTL Instructions for Multi-Task RL
About
We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language -- linear temporal logic (LTL) -- and can specify a diversity of complex, temporally extended behaviours, including conditionals and alternative realizations. Our proposed learning approach exploits the compositional syntax and semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions not observed during training. To reduce the overhead of learning LTL semantics, we introduce an environment-agnostic LTL pretraining scheme that improves sample efficiency in downstream environments. Experiments on discrete and continuous domains target combinatorial task sets of up to $\sim10^{39}$ unique tasks and demonstrate the strength of our approach in learning to solve (unseen) tasks given LTL instructions.
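A core mechanism behind this kind of LTL-conditioned learning is *LTL progression* (Bacchus & Kabanza): after each environment step, the task formula is rewritten to reflect what remains to be done, so the agent's goal input always encodes the residual task. Below is a minimal, self-contained Python sketch of progression over tuple-encoded formulas. This is an illustrative reimplementation under our own encoding assumptions, not the authors' code.

```python
def prog(f, truths):
    """Progress formula f through one step where the propositions in `truths` hold.

    Formulas are nested tuples: a string is an atomic proposition; operators are
    ('not', a), ('and', a, b), ('or', a, b), ('next', a), ('until', a, b),
    ('eventually', a), ('always', a). Returns True (satisfied), False (violated),
    or the residual formula to satisfy from the next step onward.
    """
    if isinstance(f, bool):
        return f
    if isinstance(f, str):                      # atomic proposition
        return f in truths
    op = f[0]
    if op == 'not':
        r = prog(f[1], truths)
        return (not r) if isinstance(r, bool) else ('not', r)
    if op == 'and':
        a, b = prog(f[1], truths), prog(f[2], truths)
        if a is False or b is False:
            return False
        if a is True:
            return b
        if b is True:
            return a
        return ('and', a, b)
    if op == 'or':
        a, b = prog(f[1], truths), prog(f[2], truths)
        if a is True or b is True:
            return True
        if a is False:
            return b
        if b is False:
            return a
        return ('or', a, b)
    if op == 'next':                            # X a: the obligation moves to the next step
        return f[1]
    if op == 'until':                           # prog(a U b) = prog(b) or (prog(a) and (a U b))
        b = prog(f[2], truths)
        if b is True:
            return True
        a = prog(f[1], truths)
        cont = False if a is False else (f if a is True else ('and', a, f))
        if b is False:
            return cont
        return b if cont is False else ('or', b, cont)
    if op == 'eventually':                      # F a == true U a
        a = prog(f[1], truths)
        return True if a is True else (f if a is False else ('or', a, f))
    if op == 'always':                          # G a: must hold now and forever after
        a = prog(f[1], truths)
        return False if a is False else (f if a is True else ('and', a, f))
    raise ValueError(f'unknown operator: {op}')


# Example task: "eventually visit a, and eventually visit b".
task = ('and', ('eventually', 'a'), ('eventually', 'b'))
t1 = prog(task, {'a'})    # after seeing a, only ('eventually', 'b') remains
t2 = prog(t1, {'b'})      # after seeing b, the task is satisfied (True)
```

Returning `True`/`False` doubles as a reward signal (task satisfied or violated), and the residual formula is what a task-conditioned policy would be re-conditioned on at the next step.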
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-Task Reinforcement Learning (LTL Instruction Following) | Warehouse Finite Horizon | Success Rate | 98 | 30 |
| LTL Instruction Following | Letter Finite-horizon (full) | Success Rate (SR) | 86 | 19 |
| LTL Instruction Following | ZoneEnv Finite Horizon | Success Rate (SR) | 85 | 18 |
| Multi-Task Reinforcement Learning (LTL Instruction Following) | ZoneEnv Finite Horizon | Success Rate | 84 | 18 |
| LTL Instruction Following | LetterWorld Finite-horizon | Success Rate (SR) | 84 | 12 |
| LTL-guided Reinforcement Learning | Zones Finite-horizon (test) | Success Rate | 74 | 10 |
| LTL-guided Reinforcement Learning | Letter Finite-horizon (test) | Success Rate (SR) | 74 | 9 |
| Global Avoidance | FlatWorld Base (test) | Average Total Return | 0.002 | 3 |
| Global Avoidance | FlatWorld +dep. (test) | Average Total Return | -0.175 | 3 |
| Global Avoidance | FlatWorld (+conj.) (test) | Average Total Return | -0.152 | 3 |