Generating Local Shields for Decentralised Partially Observable Markov Decision Processes
About
Multi-agent systems under partial observation often struggle to maintain safety because each agent's locally chosen action does not, in general, determine the resulting joint action. Shielding addresses this by filtering actions based on the current state, but most existing techniques either assume access to a shared centralised global state or employ memoryless local filters that cannot consider interaction history. We introduce a shield process algebra with guarded choice and recursion for specifying safe global behaviour in communication-free Dec-POMDP settings. From a shield process, we compile a process automaton and then a global Mealy machine that acts as a safe joint-action filter. Finally, we project the global machine to local Mealy machines, whose states are belief-style subsets of the global Mealy machine's states consistent with each agent's observations, and which output per-agent safe action sets. We implement the pipeline in Rust and integrate the probabilistic model checker PRISM to compute best- and worst-case safety probabilities independently of the agents' policies. A multi-agent path-finding case study demonstrates that different shield processes substantially reduce collisions compared to the unshielded baseline while exhibiting varying levels of expressiveness and conservatism.
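The projection step above can be sketched in Rust. This is a minimal illustration, not the paper's implementation: it assumes a toy two-agent encoding where the global Mealy machine lists only safe joint actions per state, a local belief is a subset of global states, a local action is deemed safe only if every state in the belief permits it (a conservative intersection), and the belief update keeps all successors consistent with the agent's own action. All type and function names here are hypothetical.

```rust
use std::collections::BTreeSet;

// Hypothetical toy encoding: 2 agents; a joint action is a pair of local actions.
type Joint = [usize; 2];

struct GlobalMealy {
    // delta[state] = list of (safe joint action, successor state).
    delta: Vec<Vec<(Joint, usize)>>,
}

impl GlobalMealy {
    // Safe local actions for agent `i` in belief `b`: actions permitted by
    // every global state the agent considers possible (intersection semantics
    // is an assumption of this sketch; it errs on the conservative side).
    fn safe_local(&self, i: usize, b: &BTreeSet<usize>) -> BTreeSet<usize> {
        let mut acc: Option<BTreeSet<usize>> = None;
        for &s in b {
            let here: BTreeSet<usize> =
                self.delta[s].iter().map(|(ja, _)| ja[i]).collect();
            acc = Some(match acc {
                None => here,
                Some(prev) => prev.intersection(&here).cloned().collect(),
            });
        }
        acc.unwrap_or_default()
    }

    // Belief update after agent `i` plays `a_i`: successors of any possible
    // state under any safe joint action whose i-th component is `a_i`.
    fn step_local(&self, i: usize, b: &BTreeSet<usize>, a_i: usize) -> BTreeSet<usize> {
        b.iter()
            .flat_map(|&s| self.delta[s].iter())
            .filter(|(ja, _)| ja[i] == a_i)
            .map(|&(_, t)| t)
            .collect()
    }
}

fn main() {
    // Tiny example: in state 0 all joint actions over {0, 1} are safe except
    // the colliding pair [1, 1]; in state 1 only [0, 0] is safe.
    let g = GlobalMealy {
        delta: vec![
            vec![([0, 0], 0), ([0, 1], 1), ([1, 0], 1)],
            vec![([0, 0], 0)],
        ],
    };
    let b0: BTreeSet<usize> = [0].into_iter().collect();
    println!("{:?}", g.safe_local(0, &b0)); // safe actions for agent 0 in belief {0}
    let b1 = g.step_local(0, &b0, 1);       // belief after agent 0 plays action 1
    println!("{:?}", g.safe_local(0, &b1)); // now only action 0 remains safe
}
```

In the paper's terms, each local machine state is such a belief set, and the output function is the per-agent safe action set computed from it; recursion in the shield process corresponds to cycles in the global machine.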
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-Agent Path Finding | MAPF 4x4 grid, n=2, 9 obstacles | Collision Rate: 0.00e+0 | 7 |
| Multi-Agent Path Finding | MAPF 4x4 grid, n=3, 9 obstacles | Collision Rate: 0.00e+0 | 5 |
| Multi-Agent Path Finding | MAPF 3x3 grid, n=3, 3 obstacles | Collision Rate: 0.00e+0 | 3 |