
Emergent Neural Automaton Policies: Learning Symbolic Structure from Visuomotor Trajectories

About

Scaling robot learning to long-horizon tasks remains a formidable challenge. End-to-end policies often lack the structural priors needed for effective long-term reasoning, while traditional neuro-symbolic methods rely heavily on hand-crafted symbolic priors. To address this, we introduce ENAP (Emergent Neural Automaton Policy), a framework that allows a bi-level neuro-symbolic policy to emerge adaptively from visuomotor demonstrations. Specifically, we first employ adaptive clustering and an extension of the L* algorithm to infer a Mealy state machine from visuomotor data; this machine serves as an interpretable high-level planner that captures latent task modes. The discrete structure then guides a low-level reactive residual network that learns precise continuous control via behavior cloning (BC). By explicitly modeling the task structure with discrete transitions and continuous residuals, ENAP achieves high sample efficiency and interpretability without requiring task-specific labels. Extensive experiments on complex manipulation and long-horizon tasks demonstrate that ENAP outperforms state-of-the-art (SoTA) end-to-end VLA policies by up to 27% in low-data regimes, while offering a structured representation of robotic intent (Fig. 1).
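
As a rough illustration of the bi-level idea described above, the following minimal Python sketch shows a Mealy machine acting as a high-level planner whose output symbols select a task mode, plus an additive correction on top of a nominal action. It is not the authors' implementation: the class `MealyMachine`, the helper `compose_action`, the state, input, and output names, and the additive form of the residual are all hypothetical choices made for this example. The clustering and L*-based inference steps are not covered here.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Sequence, Tuple


@dataclass
class MealyMachine:
    """Finite-state transducer whose output depends on (state, input symbol)."""

    initial_state: Hashable
    # Maps (state, input_symbol) -> (next_state, output_symbol).
    transitions: Dict[Tuple[Hashable, Hashable], Tuple[Hashable, Hashable]] = field(
        default_factory=dict
    )

    def add_transition(self, state, symbol, next_state, output) -> None:
        self.transitions[(state, symbol)] = (next_state, output)

    def run(self, symbols: Sequence[Hashable]) -> Tuple[Hashable, List[Hashable]]:
        """Consume input symbols; return the final state and the emitted outputs."""
        state = self.initial_state
        outputs = []
        for symbol in symbols:
            state, output = self.transitions[(state, symbol)]
            outputs.append(output)
        return state, outputs


def compose_action(nominal: Sequence[float], residual: Sequence[float]) -> List[float]:
    """Assumed additive combination: mode-level nominal action plus a correction."""
    return [n + r for n, r in zip(nominal, residual)]


# Hypothetical toy task: input symbols stand in for clustered observations,
# output symbols stand in for latent task modes.
planner = MealyMachine(initial_state="approach")
planner.add_transition("approach", "far", "approach", "reach")
planner.add_transition("approach", "near", "grasp", "close_gripper")
planner.add_transition("grasp", "held", "place", "move_to_goal")
planner.add_transition("place", "at_goal", "place", "open_gripper")

final_state, modes = planner.run(["far", "near", "held", "at_goal"])
action = compose_action([1.0, 2.0], [0.5, -0.5])
```

In this toy setup the discrete machine carries the task structure, so the continuous part only has to model a small correction within each mode, which is one intuition for why such a decomposition could help in low-data regimes.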

Yiyuan Pan, Xusheng Luo, Hanjiang Hu, Peiqi Yu, Changliu Liu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Complex Manipulation | DualStack Cube | Success Rate | 98.8 | 8 |
| Complex Manipulation | Peg Insert | Success Rate | 85.6 | 8 |
| Pick-&-Place | Real-world | Success Rate | 94.12 | 6 |
| Long-horizon TAMP | CALVIN Sequential | Success Rate (3/5 Subtasks) | 97 | 5 |
| Long-horizon TAMP | CALVIN Hierarchical | 3/5 Subtasks Success Rate | 95.5 | 5 |
| Multi-Goal Pushing | MultiGoalPushT | Either Success Rate | 94 | 4 |
| Hanger Task | Real-world | Success Rate | 94.12 | 2 |
| Stack Lego | Real-world | Success Rate | 88.24 | 2 |