Neuro-symbolic Action Masking for Deep Reinforcement Learning

About

Deep reinforcement learning (DRL) may explore infeasible actions during training and execution. Existing approaches assume a symbol grounding function that maps high-dimensional states to consistent symbolic representations, together with manually specified action masking techniques to constrain actions. In this paper, we propose Neuro-symbolic Action Masking (NSAM), a novel framework that automatically learns symbolic models of high-dimensional states, consistent with given domain constraints, in a minimally supervised manner during the DRL process. Based on the learned symbolic model of states, NSAM learns action masks that rule out infeasible actions. NSAM enables end-to-end integration of symbolic reasoning and deep policy optimization, where improvements in symbolic grounding and policy learning mutually reinforce each other. We evaluate NSAM on multiple domains with constraints, and experimental results demonstrate that NSAM significantly improves the sample efficiency of DRL agents while substantially reducing constraint violations.

Shuai Han, Mehdi Dastani, Shihan Wang • 2026
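
To make the idea of action masking in the abstract concrete, here is a minimal Python sketch. It is not the NSAM implementation: the constraint check is hand-written, whereas NSAM learns its symbolic model and masks during training. The row-by-row N-Queens formulation (one action = the column for the next row's queen) and the names `feasible_mask` and `masked_softmax` are assumptions made for illustration only.

```python
import math


def feasible_mask(queens, n):
    """Return a 0/1 mask over columns for the next row of an n x n board.

    `queens[r]` is the column of the queen already placed in row r.
    This hand-coded rule stands in for a learned symbolic model (assumption).
    """
    row = len(queens)
    mask = []
    for col in range(n):
        # A column is feasible if no earlier queen shares it or a diagonal.
        ok = all(
            col != c and abs(col - c) != row - r
            for r, c in enumerate(queens)
        )
        mask.append(1 if ok else 0)
    return mask


def masked_softmax(logits, mask):
    """Softmax over policy logits, restricted to actions with mask == 1."""
    if not any(mask):
        raise ValueError("no feasible action in this state")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# One queen at row 0, column 0 on a 4x4 board; pick a column for row 1.
mask = feasible_mask([0], 4)
probs = masked_softmax([0.5, 2.0, 0.1, 0.3], mask)
```

Masked actions receive exactly zero probability, so the agent cannot sample them even when their raw logits are the largest, as with column 1 in this example.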

Related benchmarks

Task              Dataset            Result             Rank
Sudoku Solving    Sudoku 2x2         Final Reward 1.3   14
Graph Coloring    Graph Coloring G1  Final Reward 1     7
Graph Coloring    Graph Coloring G2  Final Reward 1     7
Graph Coloring    Graph Coloring G3  Final Reward 1     7
Graph Coloring    Graph Coloring G4  Final Reward 1     7
N-Queens Problem  N-Queens N=4       Final Reward 1     7
N-Queens Problem  N-Queens N=10      Final Reward 1     7
Sudoku Solving    Sudoku 3x3         Final Reward 160   7
Sudoku Solving    Sudoku 4x4         Final Reward 2.1   7
Sudoku Solving    Sudoku 5x5         Final Reward 2.7   7
Showing 10 of 15 rows
