
HSVI can solve zero-sum Partially Observable Stochastic Games

About

State-of-the-art methods for solving 2-player zero-sum imperfect information games rely on linear programming or regret minimization, though not on dynamic programming (DP) or heuristic search (HS), while the latter are often at the core of state-of-the-art solvers for other sequential decision-making problems. In partially observable or collaborative settings (e.g., POMDPs and Dec-POMDPs), DP and HS require introducing an appropriate statistic that induces a fully observable problem, as well as bounding (convex) approximators of the optimal value function. This approach has succeeded in some subclasses of 2-player zero-sum partially observable stochastic games (zs-POSGs) as well, but how to apply it in the general case remains an open question. We answer it by (i) rigorously defining an equivalent game to work with, (ii) proving mathematical properties of the optimal value function that allow deriving bounds that come with solution strategies, (iii) proposing for the first time an HSVI-like solver that provably converges to an $\epsilon$-optimal solution in finite time, and (iv) empirically analyzing it. This opens the door to a novel family of promising approaches complementing those relying on linear programming or iterative methods.
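To make the notion of "a statistic that induces a fully observable problem" concrete, here is a minimal sketch of the classic belief update for a single-agent POMDP. This is the textbook POMDP case only, not the statistic the paper defines for zs-POSGs. The function name `belief_update`, the table layouts `T` and `O`, and the two-state "listen" numbers are illustrative assumptions.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes update of a POMDP belief (a probability vector over states).

    Assumed (hypothetical) table layout:
      T[action][s][s2] = P(s2 | s, action)           transition model
      O[action][s2][o] = P(o | s2, action)           observation model
    """
    n = len(belief)
    # Predict the next state, then weight by the likelihood of the observation.
    unnormalized = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief and action")
    return [p / total for p in unnormalized]


# Illustrative two-state example: listening does not change the state and
# reports the true state with probability 0.85 (assumed numbers).
T = {"listen": [[1.0, 0.0], [0.0, 1.0]]}
O = {"listen": [[0.85, 0.15], [0.15, 0.85]]}
new_belief = belief_update([0.5, 0.5], "listen", 0, T, O)
print(new_belief)
```

Because the next belief depends only on the current belief, the action, and the observation, planning over beliefs is a fully observable problem, which is what lets DP and HS methods such as HSVI maintain upper and lower bounds on the value function.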

Aurélien Delage, Olivier Buffet, Jilles S. Dibangoye, Abdallah Saffidine • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | adversarial-tiger l=3, 4, 5, 7, 10, 12, 14 | Time: 500 | 16 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | mabc l=3, 4, 5, 7, 10 | Time: 70 | 13 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | recycling l=3, 4, 5, 7, 10 | Time: 430 | 13 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | competitive-tiger l=3, 4, 5, 7, 10 | Time: 291 | 11 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | adversarial-tiger | Time: 500 | 7 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | mabc 3 | Time: 70 | 4 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | recycling 3 | Time: 430 | 4 |
| Solving Zero-Sum Partially Observable Stochastic Games (zs-POSGs) | competitive-tiger 3 | Time: 291 | 4 |