Vectorized Online POMDP Planning

About

Planning under partial observability is an essential capability of autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for such planning problems, capturing the stochastic effects of actions and the limited information available through noisy observations. POMDP solving could benefit tremendously from massive parallelization on today's hardware, but parallelizing POMDP solvers has been challenging. Most solvers interleave numerical optimization over actions with the estimation of their values, which creates dependencies and synchronization bottlenecks between parallel processes that can offset the benefits of parallelization. In this paper, we propose the Vectorized Online POMDP Planner (VOPP), a novel parallel online solver that leverages a recent POMDP formulation which analytically solves part of the optimization component, so that the remaining numerical computation consists only of estimating expectations. VOPP represents all data structures related to planning as a collection of tensors, and implements all planning steps as fully vectorized computations over this representation. The result is a massively parallel online solver with no dependencies or synchronization bottlenecks between concurrent processes. Experimental results indicate that VOPP is at least $20\times$ more efficient in computing near-optimal solutions than an existing state-of-the-art parallel online solver. Moreover, VOPP outperforms state-of-the-art sequential online solvers while using a planning budget that is $1000\times$ smaller.
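To make the idea of "planning steps as vectorized computations over tensors" concrete, here is a minimal sketch in Python with NumPy. It is not the VOPP algorithm: the toy corridor problem, the function name `batched_action_values`, the uniform random rollout policy, and all parameter values are assumptions chosen purely for illustration. It only shows how expectations for every action can be estimated in one batch, with no per-sample loop and no optimization step interleaved with the estimation.

```python
import numpy as np

def batched_action_values(particles, num_actions=2, depth=20,
                          gamma=0.95, goal=10, seed=0):
    """Monte Carlo value estimate for each first action, computed in one batch.

    Hypothetical toy problem: a 1-D corridor with cells 0..goal.
    Action 0 moves left, action 1 moves right; a move succeeds with
    probability 0.8. Each step costs -1 and reaching the goal pays +10.
    """
    rng = np.random.default_rng(seed)
    n = particles.shape[0]
    # states[a, i] is the state of rollout i whose first action is a.
    states = np.broadcast_to(particles, (num_actions, n)).copy()
    # The first action of each row is fixed to the row index.
    actions = np.broadcast_to(np.arange(num_actions)[:, None], (num_actions, n))
    returns = np.zeros((num_actions, n))
    alive = states < goal          # rollouts that have not terminated yet
    discount = 1.0
    for t in range(depth):         # the only loop is over the time horizon
        if t > 0:
            # Assumed rollout policy: uniformly random actions.
            actions = rng.integers(0, num_actions, size=(num_actions, n))
        step = np.where(actions == 1, 1, -1)
        success = rng.random((num_actions, n)) < 0.8
        moved = np.clip(states + step * success, 0, goal)
        next_states = np.where(alive, moved, states)
        reached = alive & (next_states == goal)
        reward = np.where(alive, -1.0, 0.0) + np.where(reached, 10.0, 0.0)
        returns += discount * reward
        states = next_states
        alive = alive & ~reached
        discount *= gamma
    # One expectation per action, averaged over all belief particles.
    return returns.mean(axis=1)

# Usage: a belief concentrated one cell to the left of the goal.
belief_particles = np.full(256, 9)
q_values = batched_action_values(belief_particles)
best_action = int(np.argmax(q_values))
```

Because every array has shape `(num_actions, num_particles)`, each time step is a handful of elementwise operations that could run on a GPU without any synchronization between rollouts; the action comparison happens once at the end rather than inside the sampling loop.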

Marcus Hoerger, Muhammad Sudrajat, Hanna Kurniawati • 2025

Related benchmarks

Task	Dataset	Metric	Result
Multi-Agent Rock Sample (POMDP)	MARS (20, 20)	Average Discounted Reward	58.8	18
Robot navigation	Navigation	Average Total Discounted Reward	11.9	16
Goal Navigation	Navigation problem	Path Length (Steps)	19.8	4
Rock Sampling	MARS (20, 20)	Success Rate	90	4
Multi-Agent Rock Sample (POMDP)	MARS 50, 50	Average Total Discounted Reward	45.1	1
Rock Sampling	MARS 50, 50	Success Rate	84.1	1

