Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Synthesizing POMDP Policies: Sampling Meets Model-checking via Learning

About

Partially Observable Markov Decision Processes (POMDPs) are the standard framework for decision-making under uncertainty. While sampling-based methods scale well, they lack formal correctness guarantees, making them unsuitable for safety-critical applications. Conversely, formal synthesis techniques provide correctness-by-construction but often struggle with scalability, as general POMDP synthesis is undecidable. To bridge this gap, we propose a synthesis framework that integrates sampling, automata learning, and model-checking. Inspired by Angluin's $L^*$ algorithm, our approach utilizes sampling as a membership oracle and model-checking as an equivalence oracle. This enables the synthesis of finite-state controllers with formal guarantees, provided the sampling-induced policy is regular. We establish a relative completeness result for this framework. Experimental results from our prototypical implementation demonstrate that this method successfully solves threshold-safety problems that remain challenging for existing formal synthesis tools. We believe our algorithm serves as a valuable component in a portfolio approach to tackling the inherent difficulty of POMDP synthesis problems.

Debraj Chakraborty, Anirban Majumdar, Prince Mathew, Sayan Mukherjee, Jean-Fran\c{c}ois Raskin• 2026

Related benchmarks

TaskDatasetResultRank
Policy SynthesisGrid World
Success Rate5
16
POMDP Policy Synthesischeese-maze
Execution Time2.83
4
POMDP Policy Synthesisrefuel N=6, E=8
Execution Time1.46
4
POMDP Policy SynthesisHallway
Execution Time49.32
4
POMDP Policy Synthesiscards-removed 2
Time1.48
4
POMDP Policy Synthesiscards-removed 3
Execution Time2.1
4
POMDP Policy Synthesiscards removed-4
Time7.42
4
POMDP Policy Synthesiscards-added 2
Execution Time1.77
4
POMDP Policy Synthesiscards-added-3
Execution Time10.66
4
POMDP Policy Synthesischeese-maze-det
Execution Time1.52
4
Showing 10 of 16 rows

Other info

Follow for update