Generating Local Shields for Decentralised Partially Observable Markov Decision Processes
About
Multi-agent systems under partial observation often struggle to maintain safety because each agent's locally chosen action does not, in general, determine the resulting joint action. Shielding addresses this by filtering actions based on the current state, but most existing techniques either assume access to a shared centralised global state or employ memoryless local filters that cannot consider interaction history. We introduce a shield process algebra with guarded choice and recursion for specifying safe global behaviour in communication-free Dec-POMDP settings. From a shield process, we compile a process automaton and then a global Mealy machine that acts as a safe joint-action filter. Finally, we project the global machine to local Mealy machines, whose states are belief-style subsets of the global Mealy machine's states consistent with each agent's observations, and which output per-agent safe action sets. We implement the pipeline in Rust and integrate the probabilistic model checker PRISM to compute best- and worst-case safety probabilities independently of the agents' policies. A multi-agent path-finding case study demonstrates that different shield processes substantially reduce collisions compared to the unshielded baseline while exhibiting varying levels of expressiveness and conservatism.
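The projection step above can be sketched in Rust. This is a minimal illustration, not the paper's implementation: it assumes a toy two-agent encoding where the global Mealy machine lists only safe joint actions per state, a local belief is a subset of global states, a local action is deemed safe only if every state in the belief permits it (a conservative intersection), and the belief update keeps all successors consistent with the agent's own action. All type and function names here are hypothetical.

```rust
use std::collections::BTreeSet;

// Hypothetical toy encoding: 2 agents; a joint action is a pair of local actions.
type Joint = [usize; 2];

struct GlobalMealy {
    // delta[state] = list of (safe joint action, successor state).
    delta: Vec<Vec<(Joint, usize)>>,
}

impl GlobalMealy {
    // Safe local actions for agent `i` in belief `b`: actions permitted by
    // every global state the agent considers possible (intersection semantics
    // is an assumption of this sketch; it errs on the conservative side).
    fn safe_local(&self, i: usize, b: &BTreeSet<usize>) -> BTreeSet<usize> {
        let mut acc: Option<BTreeSet<usize>> = None;
        for &s in b {
            let here: BTreeSet<usize> =
                self.delta[s].iter().map(|(ja, _)| ja[i]).collect();
            acc = Some(match acc {
                None => here,
                Some(prev) => prev.intersection(&here).cloned().collect(),
            });
        }
        acc.unwrap_or_default()
    }

    // Belief update after agent `i` plays `a_i`: successors of any possible
    // state under any safe joint action whose i-th component is `a_i`.
    fn step_local(&self, i: usize, b: &BTreeSet<usize>, a_i: usize) -> BTreeSet<usize> {
        b.iter()
            .flat_map(|&s| self.delta[s].iter())
            .filter(|(ja, _)| ja[i] == a_i)
            .map(|&(_, t)| t)
            .collect()
    }
}

fn main() {
    // Tiny example: in state 0 all joint actions over {0, 1} are safe except
    // the colliding pair [1, 1]; in state 1 only [0, 0] is safe.
    let g = GlobalMealy {
        delta: vec![
            vec![([0, 0], 0), ([0, 1], 1), ([1, 0], 1)],
            vec![([0, 0], 0)],
        ],
    };
    let b0: BTreeSet<usize> = [0].into_iter().collect();
    println!("{:?}", g.safe_local(0, &b0)); // safe actions for agent 0 in belief {0}
    let b1 = g.step_local(0, &b0, 1);       // belief after agent 0 plays action 1
    println!("{:?}", g.safe_local(0, &b1)); // now only action 0 remains safe
}
```

In the paper's terms, each local machine state is such a belief set, and the output function is the per-agent safe action set computed from it; recursion in the shield process corresponds to cycles in the global machine.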
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-Agent Path Finding | MAPF 4x4 grid, n=2, 9 obstacles | Collision Rate: 0.00e+0 | 7 |
| Multi-Agent Path Finding | MAPF 4x4 grid, n=3, 9 obstacles | Collision Rate: 0.00e+0 | 5 |
| Multi-Agent Path Finding | MAPF 3x3 grid, n=3, 3 obstacles | Collision Rate: 0.00e+0 | 3 |