An $\epsilon$-Optimal Sequential Approach for Solving zs-POSGs
About
While recent reductions of zero-sum partially observable stochastic games (zs-POSGs) to transition-independent stochastic games (TI-SGs) theoretically admit dynamic programming, practical solutions remain stifled by the inherent non-linearity and exponential complexity of the simultaneous minimax backup. In this work, we surmount this computational barrier by rigorously recasting the simultaneous interaction as a sequential decision process via the principle of separation. We introduce distinct sufficient statistics for valuation and execution, the sequential occupancy state and the private occupancy family, which reveal a latent geometry in the optimal value function. This structural insight allows us to linearise the backup operator, reducing the update complexity from exponential to polynomial while enabling the direct extraction of safe policies without heuristic bookkeeping. Experimental results demonstrate that algorithms leveraging this sequential framework significantly outperform state-of-the-art methods, effectively rendering previously intractable domains solvable.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | adversarial-tiger l=3, 4, 5, 7, 10, 12, 14 | Time0.15 | 16 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | mabc l=3, 4, 5, 7, 10 | Time0.85 | 13 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | recycling l=3, 4, 5, 7, 10 | Time27 | 13 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | competitive-tiger l=3, 4, 5, 7, 10 | Time-0.23 | 11 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | adversarial-tiger | Time0.15 | 7 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | Kuhn Poker | Time0.2 | 6 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | mabc 3 | Time0.85 | 4 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | recycling 3 | Time27 | 4 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | competitive-tiger 3 | Time48 | 4 | |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | adversarial-tiger | Time10 | 3 |