Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Accelerating Policy Synthesis in Large-Scale MDPs via Hierarchical Adaptive Refinement

About

Software-intensive systems, such as software product lines and robotics, utilise Markov decision processes (MDPs) to capture uncertainty and analyse sequential decision-making problems. Despite the usefulness of conventional policy synthesis methods, they fail to scale to large state spaces. Our approach addresses this issue and accelerates policy synthesis in large MDPs by dynamically refining the MDP and iteratively selecting the most fragile MDP regions for refinement. This iterative procedure offers a balance between accuracy and efficiency, as refinement occurs only when necessary. We formally show that the composed policy is near-optimal under standard assumptions, with error bounded by the local solver tolerance and boundary mismatch. Across diverse case studies and MDPs up to 1M states, we demonstrate that our approach achieves up to $2\times$ speedup over PRISM, offering a competitive solution for real-world policy synthesis in large MDPs.

Alexandros Evangelidis, Gricel V\'azquez, Simos Gerasimou• 2025

Related benchmarks

TaskDatasetResultRank
Policy SynthesisWarehouse MDPs Property: Rmin R{“steps”}min=? [F “goal”] W512/W1024
Time (s)12.49
30
Policy SynthesisWarehouse MDPs Property: Pmax F “goal” W512 W1024
Time (s)7.61
30
Policy Synthesisarenas Rmin 16
Execution Time (s)152.8
2
Policy Synthesisarenas8 Rmin
Execution Time (s)2.94
2
Policy Synthesiswlan3 Pmax
Time (s)5.28
2
Policy Synthesisfirewire Rmin
Execution Time (s)0.187
2
Policy Synthesisconsensus Pmax
Time (s)5.414
2
Policy Synthesiszeroconf Rmin
Execution Time (s)0.224
2
Showing 8 of 8 rows

Other info

Follow for update