Semantically Labelled Automata for Multi-Task Reinforcement Learning with LTL Instructions
About
We study multi-task reinforcement learning (RL), a setting in which an agent learns a single, universal policy capable of generalising to arbitrary, possibly unseen tasks. We consider tasks specified as linear temporal logic (LTL) formulae, which are commonly used in formal methods to specify properties of systems, and have recently been successfully adopted in RL. In this setting, we present a novel task embedding technique leveraging a new generation of semantic LTL-to-automata translations, originally developed for temporal synthesis. The resulting semantically labelled automata contain rich, structured information in each state that allows us to (i) compute the automaton efficiently on-the-fly, (ii) extract expressive task embeddings used to condition the policy, and (iii) naturally support full LTL. Experimental results in a variety of domains demonstrate that our approach achieves state-of-the-art performance and is able to scale to complex specifications where existing methods fail.
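To make the on-the-fly idea concrete, the following is a minimal, hypothetical sketch of tracking an LTL task during an episode via formula progression (the classical Bacchus–Kabanja technique), where the progressed formula plays the role of the current automaton state and its syntax is what a semantic labelling would expose to the policy. This is an illustration of the general mechanism, not the paper's actual semantically labelled automaton construction; the formula encoding and `progress` function below are assumptions made for the example.

```python
# Illustrative only: on-the-fly LTL tracking by formula progression.
# This is NOT the paper's construction -- just the standard mechanism
# it builds on. Formulas are encoded as: True/False, atom strings,
# and tuples ('not', atom), ('and', f, g), ('or', f, g), ('next', f),
# ('until', f, g), ('eventually', f), ('always', f).

def _and(f, g):
    # Conjunction with constant folding, so states stay small.
    if f is False or g is False: return False
    if f is True: return g
    if g is True: return f
    return ('and', f, g)

def _or(f, g):
    # Disjunction with constant folding.
    if f is True or g is True: return True
    if f is False: return g
    if g is False: return f
    return ('or', f, g)

def progress(formula, props):
    """One-step progression: the obligation that must hold from the
    next state on, given the set of atoms true in the current state."""
    if formula in (True, False):
        return formula
    if isinstance(formula, str):              # atomic proposition
        return formula in props
    op = formula[0]
    if op == 'not':                           # negated atom
        return formula[1] not in props
    if op == 'and':
        return _and(progress(formula[1], props), progress(formula[2], props))
    if op == 'or':
        return _or(progress(formula[1], props), progress(formula[2], props))
    if op == 'next':
        return formula[1]
    if op == 'until':                         # f U g = g | (f & X(f U g))
        return _or(progress(formula[2], props),
                   _and(progress(formula[1], props), formula))
    if op == 'eventually':                    # F f = f | X F f
        return _or(progress(formula[1], props), formula)
    if op == 'always':                        # G f = f & X G f
        return _and(progress(formula[1], props), formula)
    raise ValueError(f"unknown operator: {op}")

# "Reach a while avoiding b": (not b) U a
task = ('until', ('not', 'b'), 'a')
assert progress(task, {'a'}) is True          # task satisfied
assert progress(task, {'b'}) is False         # task violated
assert progress(task, set()) == task          # still pending
```

Each environment step maps the current obligation to a successor obligation, so the automaton is never built up front; an embedding of the obligation's syntax tree is one natural way to condition a universal policy on the task.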
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| LTL Instruction Following | Letter Finite-horizon (full) | Success Rate (SR) 100 | 19 |
| LTL Instruction Following | ZoneEnv Finite Horizon | Success Rate (SR) 96 | 18 |
| LTL Instruction Following | Zones Infinite-horizon (full) | µAcc 18.65 | 14 |
| LTL Instruction Following | LetterWorld Finite-horizon | Success Rate (SR) 100 | 12 |
| LTL Instruction Following | Letter Infinite-horizon (full) | µAcc 7.13 | 10 |
| LTL-guided Reinforcement Learning | Zones Finite-horizon (test) | Success Rate (SR) 98 | 10 |
| LTL-guided Reinforcement Learning | Letter Finite-horizon (test) | Success Rate (SR) 100 | 9 |
| LTL-guided Reinforcement Learning | Zones Infinite-horizon (test) | µAcc 18.65 | 7 |
| LTL-guided Reinforcement Learning | Letter Infinite-horizon (test) | µAcc 7.13 | 5 |
| LTL Instruction Following | LetterWorld Infinite-horizon | µAcc 11.67 | 4 |